Single-task, lightweight, short-context bp-res. profile models often perform on par with or outperform current large, multi-task, long-context models on counterfactual prediction. Much to do to improve.
Bonus: robust, efficient interpretation of syntax
Great collab with @jengreitz.bsky.social lab.
📢 new preprint alert: So so excited to share our analysis on the impact of common and rare variants on single-cell gene expression in blood, using WGS and scRNA-seq data from nearly 2,000 individuals and 5.4m cells as part of TenK10K phase 1 🧬 www.medrxiv.org/content/10.1...
🧵👇 (1/n)
We are the Stegle Lab: A bioinformatics group advancing computational methods to study molecular variations and their impact on phenotypes. We are jointly hosted at the German Cancer Research Center (@dkfz.bsky.social) and the European Molecular Biology Laboratory (@embl.org) in Heidelberg, Germany.
Are you using any of our factor models, such as MOFA? 🛵
You might’ve found it challenging to tailor them to your specific use cases - not anymore!
Introducing MOFA-FLEX: a flexible, modular factor analysis framework designed for customizable modeling across diverse multi-omics data scenarios. 1/n
Very cool paper from Eddie Park and Yi Xing studying the relationship between intron retention QTLs and expression QTLs. Predictably, genetically regulated intron retention can cause changes in gene expression via nonsense-mediated decay (NMD). www.biorxiv.org/content/10.1...
Our ChromBPNet preprint out!
www.biorxiv.org/content/10.1...
Huge congrats to Anusri! This was quite a slog (for both of us) but we r very proud of this one! It is a long read but worth it IMHO. Methods r in the supp. materials. Bluetorial coming soon below 1/
An interesting "what have we been doing all these years?" result from this paper is how sub-optimal the widely-used uniform sampling scheme can be (cluster all @50%, sample from all clusters equally). In contrast, strategies that account for the relative differences in cluster size improve val loss
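To make the contrast concrete, here is a minimal sketch of the two kinds of sampling scheme. It is not the paper's method: the post does not say which size-aware strategy was used, so the exponent `alpha`, the function names, and the toy cluster contents are all assumptions for illustration. With `alpha = 0` every cluster is drawn equally often (the uniform scheme); with `alpha = 1` clusters are drawn in proportion to their size.

```python
import random


def cluster_sampling_weights(cluster_sizes, alpha):
    """Return per-cluster sampling probabilities proportional to size ** alpha.

    alpha = 0 -> every cluster equally likely (uniform over clusters).
    alpha = 1 -> probability proportional to cluster size.
    The exponent is a hypothetical knob, not something taken from the paper.
    """
    raw = {name: size ** alpha for name, size in cluster_sizes.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def sample_sequences(clusters, alpha, k, seed=0):
    """Draw k members: pick a cluster by weight, then a member uniformly within it."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in clusters.items()}
    weights = cluster_sampling_weights(sizes, alpha)
    names = list(clusters)
    picked = rng.choices(names, weights=[weights[n] for n in names], k=k)
    return [rng.choice(clusters[name]) for name in picked]


# Toy data: one large cluster and one small one (made-up identifiers).
toy_clusters = {
    "big": [f"big_{i}" for i in range(90)],
    "small": [f"small_{i}" for i in range(10)],
}
uniform_batch = sample_sequences(toy_clusters, alpha=0.0, k=8)
size_aware_batch = sample_sequences(toy_clusters, alpha=1.0, k=8)
```

Under the uniform scheme each member of the small cluster is seen far more often than each member of the large one. Intermediate values of `alpha` trade off between the two extremes.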
New work by Andy Dahl and Michal Sadowski on using GxE to study genetics of drug response now out in Cell Genomics www.cell.com/cell-genomic...
Are you a postgraduate student interested in protein modelling and drug discovery?
We have an exciting opportunity to join our team at GSK for a 6-9 month internship, working on an ambitious cross-department research project. Apply before March 14th!
www.linkedin.com/jobs/view/41...
What do GWAS and rare variant burden tests discover, and why?
Do these studies find the most IMPORTANT genes? If not, how DO they rank genes?
Here we present a surprising result: these studies actually test for SPECIFICITY! A 🧵on what this means... (🧪🧬)
www.biorxiv.org/content/10.1...
Posted 11:13:48 PM. Site Name: UK - Hertfordshire - Stevenage, Heidelberg - Office. Posted Date: Feb 28 2025. We create a… See this and similar jobs on LinkedIn.
Despite extensive mapping of cis-regulatory elements (cREs) across cellular contexts with chromatin accessibility assays, the sequence syntax and genetic variants that regulate transcription factor (T...
Sadowski et al. propose a framework to study the genetics of response to commonly prescribed drugs in large biobanks. They quantify the heritability of response to statins, metformin, warfarin, and methotrexate, and identify associated genes. Their analysis also shows the importance of accounting for drug use in genetic risk prediction.
Intron retention is a type of alternative splicing in which introns remain unspliced in mature RNA transcripts. In order to explore the landscape and consequences of genetically regulated intron reten...
Standard genome-wide association studies (GWAS) and rare variant burden tests are essential tools for identifying trait-relevant genes. Although these methods are conceptually similar, we show by anal...