Now published in NSMB!
Paper: doi.org/10.1038/s415...
Full PDF: rdcu.be/fhBtI
Overview of additions since the preprint👇 (1/5)
This work introduces the Runs N’ Poses dataset for benchmarking deep learning methods on the protein–ligand complex prediction task. It shows that current methods rely on memorization, challenging the...
A fun little idea that worked surprisingly well, using a structure-informed yet structure-independent alphabet for de novo protein design: www.biorxiv.org/content/10.6...
🧵(1/n)
Meanwhile, Boltz-2 was released (cutoff 1 June 2023), using ~2 extra years of PDB data compared with the other methods. This additional data does not seem to improve generalization with the current architecture (a), and it also significantly decreases the number of difficult systems in RnP (b). (3/5)
Interestingly, Boltz1x shows a success rate similar to Boltz-1's but a boost in PB-valid predictions (F), with all predicted systems passing the Minimum Distance To Protein check, which seems to be a major issue for other methods (E). (4/5)
Thanks again to the co-authors @jeeberhardt.bsky.social, @torstenschwede.bsky.social, @ninjani.bsky.social and all collaborators! It’s great to see our work being widely used to benchmark and improve new methods (e.g., OpenFold3, Isomorphic Labs, Pearl), helping advance the field of PLI prediction!
Search with TEA 🍵 Against Many!
→ On the web: pickybinders.org/tea/steam
→ Locally: github.com/PickyBinders...
Feedback welcome!
Peter Škrinjar
Excited to share our latest preprint evaluating AlphaFold3, Boltz-1, Chai-1 and Protenix for predicting protein-ligand interactions, featuring our newly introduced benchmark dataset 🌹Runs N’ Poses🌹!
www.biorxiv.org/content/10.1...
🧵👇 (1/n)
Fresh from bioRxiv: our latest work introducing The Embedded Alphabet (TEA), a powerful new representation for protein sequences obtained by discretising ESM2 embeddings into 20 characters.
Pre-print: www.biorxiv.org/content/10.1...
🧵👇(1/n)
Deep learning has driven major breakthroughs in protein structure prediction, however the next critical advance is accurately predicting how proteins interact with other molecules, especially small mo...
www.biorxiv.org
Detecting remote homology with speed and sensitivity is crucial for tasks like function annotation and structure prediction. We introduce a novel approach using contrastive learning to convert protein...
We compared rigid docking with varying prior knowledge to AF3 on single-ligand systems in RnP. The unrealistic redocking scenario shows that cases in the lower bins aren’t challenging from a physics perspective, while AF3-dock-ideal indicates that predicting the holo conformation for those cases remains difficult. (2/5)
I'm excited to share *Stoic*, a method for fast and accurate protein complex stoichiometry prediction directly from sequence. Preprint: www.biorxiv.org/content/10.6... 🧵👇(1/10)
Lorenzo Pantolini
Meet Stoic from @daniil-litvinov.bsky.social and @ninjani.bsky.social: embeddings to predict stoichiometry of protein complexes from sequence fast and accurately 🧬🧩💻🤩
www.biorxiv.org/content/10.6...