Massive thanks to everyone involved in this study: @chundru.bsky.social @drarwood.bsky.social @carolinefwright.bsky.social @laferrat.bsky.social @rnbeaumont.bsky.social @mnweedon.bsky.social @kash-a-patel.bsky.social @drghawkes.bsky.social @timfrayling.bsky.social, Liza Darrous and Aurelie Kamoun
In summary, we show that without conditioning, many significant rare variant associations in large-scale WGS association analyses are driven by LD and haplotype structure. We developed a federated conditioning framework which prioritises independent single variants and aggregates.
This is because rare variants and aggregates are confounded by linkage and haplotype structure, just as in common variant GWAS. Previous methods that rely on reference panels or LD matrices to determine independence break down or don't scale to very rare variants in WGS data.
We also identify an example (rare intronic splicing variants in C8B, at the PCSK9-C8B-ANGPTL3 locus) where long-range haplotypes and missing data make it difficult to draw confident conclusions
We identify allelic series of variants associated with reduced LDL-C, including loss-of-function variants in DNAJC13 and variants in the 3-prime untranslated region of LDLR
Excited to share my first preprint, on federated conditional analysis of rare single variant and aggregate association tests across six genetically inferred ancestry groups in All of Us and UK Biobank: doi.org/10.64898/202...
Using individual-level data across biobanks in an iterative conditional meta-analysis of LDL-C, we show that only 4.3% of rare single variant associations and 6.9% of rare variant aggregate associations at study-wide significance were conditionally independent