Cleanifier is a fast and memory-frugal tool to remove host contamination from microbial sequence data.
We build a pangenome gapped k-mer index using a probabilistic Cuckoo filter (doi.org/10.48550/arX..., @Alenex 2026) for low memory requirements and fast queries.
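To illustrate the idea of a gapped k-mer index, here is a minimal Python sketch. It is not Cleanifier's code or API: the mask `##_##`, the helper names, and the toy sequences are made up for illustration, and a plain Python `set` stands in for the probabilistic Cuckoo filter.

```python
# Illustrative sketch only: hypothetical names, a plain set instead of a Cuckoo filter.

def gapped_kmers(seq, mask):
    """Yield the gapped k-mers of seq; '#' marks used positions, '_' ignored ones."""
    keep = [i for i, c in enumerate(mask) if c == "#"]
    for start in range(len(seq) - len(mask) + 1):
        window = seq[start:start + len(mask)]
        yield "".join(window[i] for i in keep)


def build_index(host_sequences, mask):
    """Collect all gapped k-mers of the host sequences into one set."""
    index = set()
    for seq in host_sequences:
        index.update(gapped_kmers(seq, mask))
    return index


def host_fraction(read, index, mask):
    """Return the fraction of the read's gapped k-mers found in the host index."""
    kmers = list(gapped_kmers(read, mask))
    if not kmers:
        return 0.0  # read shorter than the mask
    return sum(k in index for k in kmers) / len(kmers)


MASK = "##_##"                       # assumed toy mask, not the one used by the tool
index = build_index(["ACGTACGTAC"], MASK)

print(host_fraction("ACCTA", index, MASK))     # differs from host only at the gap
print(host_fraction("TTTTTTTT", index, MASK))  # shares no gapped k-mer with host
```

Because the gap position is ignored, a read that differs from the host only at that position still matches; a read would then be flagged as host when its fraction exceeds some chosen threshold.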
Jens Zentgraf
We are excited that our paper "Cleanifier: Contamination removal from microbial sequences using spaced seeds of a human pangenome index" is now published in Bioinformatics (doi.org/10.1093/bioi...).
You can find it on GitLab (gitlab.com/rahmannlab/c...) or install it via PyPI or Bioconda.
Fast Set Operations for Compact k-mer Sets https://www.biorxiv.org/content/10.64898/2026.05.24.727514v1