keju’s improved sensitivity does not come at the cost of calibration. In fact, keju is also better calibrated than previous methods when benchmarked on negative controls with masked labels. (6/n)
Happy to see this work out now in Genome Biology! Check out the final version here for your FACS DMS needs: link.springer.com/article/10.1...
Deep mutational scanning (DMS) coupled with fluorescence-activated cell sorting (FACS) provides a high-throughput method to link genetic variants with quantitative molecular phenotypes. Analysis of th...
I’d like to thank Adam Zahm and @justingenglish.bsky.social, who were great collaborators through this entire process, as well as my mentors @hjp.bsky.social and Sriram Sankararaman! (n/n)
My preprint on keju, a statistical tool for Massively Parallel Reporter Assay (MPRA) data, is out! keju improves sensitivity, calibration, and reliability over previous methods by closely modeling important uncertainty sources in MPRAs. Check it out: www.biorxiv.org/content/10.6... (1/n)
Thanks primarily to these assumptions, keju shows substantial improvements in recovery of ground truth effects compared to previous methods (in simulated data). (5/n)
Another axis is batch structure. Different batches can be linked to different experimental conditions, treatments, or perturbations, and show substantial variation. Accordingly, keju further improves sensitivity through RNA batch-specific uncertainty estimation. (4/n)
Also, I forgot to add this, but I am actively looking for postdoc positions in the general space of functional genomics, statistics, and genetics! Please reach out if you have open positions or are interested, with a start around fall or winter of this year! (n+1/n)
Jerome
One axis is modality. DNA counts in MPRAs are primarily functions of transfection, while RNA counts are downstream of many noisy biological processes. keju improves sensitivity by ignoring DNA count uncertainty, instead focusing on uncertainty estimation in the RNA counts. (3/n)
MPRAs offer exciting opportunities to link genetics to transcription through paired DNA and RNA observations. However, MPRAs vary in experimental design and batch structure, each of which introduces a different axis of uncertainty that complicates inference. (2/n)
Check out our new preprint on Lilace, a statistical tool for scoring FACS-based deep mutational scanning experiments! Lilace directly models the shift between variant fluorescence distributions and provides score uncertainty estimates to better assess reliability and reproducibility. (1/3)