Excited to share our new paper in @JNCI_Now! We integrated GWAS data from 11 solid cancers with ~1,500 cell type annotations to pinpoint WHERE in the body cancer risk variants actually act. A thread.
@peter-kraft.bsky.social
📣 We’re hiring a postdoc in population & statistical genetics! 🧬
If you’re finishing a PhD (or know someone who is) and want to work on complex trait biology + real-world impact, read on.
“[G]enerative models generate numbers which are thought to represent real possibilities in the world. It is very tempting to change model inputs and then look at the result. A safe way to describe this is ‘in silico perturbation’, but many use the word ‘counterfactual.’ This is very dangerous.”
(Quotes lightly edited for space. Any errors on me. Do read the original post.)
“Many valid, tested, robust, and clearly generalisable generative models will not correspond to the real world if you do this in silico perturbation. The presence of a robust and generalisable association model absolutely does not mean that interventions can be modeled.”
Kodama: genotype compression and matrix multiplication leveraging genetic relatedness www.biorxiv.org/content/10.6...
(Even “in silico perturbation” might be too strong. Maybe simply “simulation”? Generative models capture more complexity [tuned to training data] than your typical biostats simulation [“now we’re really going to get fancy and throw in 2nd order interactions”], but they’re still simulacra.)