Performance: multithreading delivers a 17x speedup, and new parallel GZIP compression makes v0.40 5x faster on I/O than v0.39. At 40 threads, a 78M read-pair plant dataset completes in ~11 min, on par with the highly optimised C++ tools.
New trimming steps: HEADCROP/TAILCROP for UMI & barcode removal, AVGQUAL for whole-read quality filtering, BASECOUNT for post-trimming length enforcement. Plus auto PHRED detection and built-in paired-end integrity validation.
Accuracy benchmark vs. fastp, RabbitTrim, BBDuk, Skewer & Cutadapt: Trimmomatic left only 14 residual adapter reads on the plant dataset. fastp: 810. Cutadapt: 23,874. And on the human dataset: 0. The palindrome algorithm still leads.
#OpenSource #Genomics
Marie Bolger closes the session at #SOLRUB2025 with a presentation on applying Helixer to the annotation of #Solanaceae genomes.
Our paper on Trimmomatic v0.40 is officially published in Bioinformatics! A decade after the original, the ~50,000-times-cited NGS trimmer gets a major overhaul. Here's what's new 🧵
doi.org/10.1093/bioi...
#Bioinformatics #NGS
Thank you all for joining #BioHackEU25! A special thanks to our online and overseas participants staying up late hacking through different projects with us. It has been a great week! And we look forward to seeing you next year for #BioHackEU26 in Barcelona, Spain! 🥳
The @denbi.bsky.social crew & friends from Germany at #BioHackEU25!
✨ We're halfway through #BioHackEU25!
This afternoon, all projects will present their 🎯 goals and the challenges they’ve encountered, and seek expertise from participants.
Keywords of this year: #Data, #Research, #FAIR and #Metadata
Hello, Bluesky! We’re AgBioData, a consortium of agricultural databases, researchers, and curators working to make genomic, genetic, and breeding data FAIR.
Follow us for updates on:
Data standards
Tools & best practices
Community news
Sustainability
#AgBio #OpenScience #FAIRdata #AgBioData
Abstract. Motivation: Trimmomatic is a widely adopted tool for preprocessing high-throughput sequencing data, particularly from Illumina platforms. Since its