Inlay

We've been looking at how to compare and cluster very large collections of genomes, such as isolate databases (e.g. AllTheBacteria) and metagenome assemblies (e.g. SPIRE, MGnify). On a combined dataset of 5.6 million assemblies, we can now cluster/dereplicate everything in under a day!

🧬 New preprint! We clustered 5.6 million bacterial genomes into genomically cohesive units (GCUs) 500× faster than existing tools (in just 14 hours, using 16.5 GB RAM and 48 CPUs). 🦠🐙 Meet gemsparcl 💎✨! www.biorxiv.org/content/10.6...