This reframes the folding problem as: what determines the burial of the hard-to-predict core residues? The core identity score is available on GitHub with a Google Colab notebook. Try it on your own structures! 8/8 Link: github.com/agrigas115/core_identity_score
2mo
(1/n) Does basal stem cell division orientation regulate skin stratification and tissue mechanics? And can tissue mechanics feed back to control division orientation? In our new preprint, we use a 3D vertex model to explore this @manningresearch.bsky.social @somiealo.bsky.social
1mo
Alex Grigas
www.biorxiv.org
Excited to highlight a new preprint about mechanical contributions to tissue homeostasis, from the Manning group in collaboration with the amazing Carien Niessen and Sara Wickstrom @sarawickstrom.bsky.social labs, spearheaded by Dr. Somiealo Azote: www.biorxiv.org/content/10.6...
4mo
Rajendra Singh Negi
Can hydrophobicity scales identify the correct core? The textbook picture says hydrophobic collapse drives folding. But ~23% of incorrectly folded models have cores that are more hydrophobic than the native fold. Current scales can't solve core identity by maximization. 7/8
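To make the "maximization" test concrete, here is a minimal sketch that scores a candidate core by the mean hydrophobicity of its residues. It assumes the Kyte-Doolittle scale; the post does not say which scales were tested, and the sequence and both core assignments below are made-up toy data, not results.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the thread does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_core_hydrophobicity(sequence, core_labels):
    """Average hydropathy over residues labeled 1 (core)."""
    values = [KYTE_DOOLITTLE[aa] for aa, c in zip(sequence, core_labels) if c == 1]
    if not values:
        raise ValueError("no core residues labeled")
    return sum(values) / len(values)

# Toy data (hypothetical): same sequence, two different core assignments.
sequence = "LKVDIEAR"
native_core = [1, 0, 1, 0, 0, 0, 1, 0]  # buries L, V, A
decoy_core = [1, 0, 1, 0, 1, 0, 0, 0]   # buries L, V, I

native_h = mean_core_hydrophobicity(sequence, native_core)
decoy_h = mean_core_hydrophobicity(sequence, decoy_core)
print(f"native core: {native_h:.2f}, decoy core: {decoy_h:.2f}")
```

In this toy case the decoy core scores higher than the native one, which is the failure mode the post describes: picking the most hydrophobic core would select the wrong assignment.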
To compare fairly, we measure bits/residue by accounting for label entropy and sending random subsets of the true labels. Core identity reaches ρ=0.9 at just 0.4 bits/residue, versus 0.68 for contacts and 0.58 for 3Di. It's the most efficient encoding we tested. 4/8
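One rough way to picture the bits/residue accounting is sketched below. It treats each burial label as an independent Bernoulli variable and charges its Shannon entropy for every label that is sent. The function names, the independence assumption, and the example numbers are illustrative assumptions, not the preprint's exact procedure.

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a binary label that equals 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def bits_per_residue(core_fraction: float, fraction_sent: float) -> float:
    """Assumed cost model: entropy per label times the fraction of residues sent."""
    return fraction_sent * binary_entropy(core_fraction)

# Hypothetical example: half the residues are core, a quarter of the labels are sent.
cost = bits_per_residue(core_fraction=0.5, fraction_sent=0.25)
print(f"{cost:.2f} bits/residue")
```

Under this model, a skewed core fraction lowers the cost per label, which is why label entropy has to be accounted for before comparing encodings.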
What about predicting from sequence alone? We trained a lightweight predictor on ESM2 embeddings for burial and compared to ESM2-predicted contacts. Predicting burial from sequence gives a better LDDT correlation than using contacts (ρ=0.82 vs 0.75), and combining the two doesn't help. 5/8
Core identity scoring is robust to ~10% random label noise. But sequence-based predictors don't make random errors: they fail on hydrophobic residues with high label entropy, which are precisely the residues that matter most for fold quality. 6/8
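For intuition, random label noise of this kind can be sketched as flipping a fixed fraction of burial labels chosen uniformly at random. The helper name, the seed, and the toy labels are assumptions for illustration; the preprint's noise protocol may differ.

```python
import random

def flip_labels(labels, noise_fraction, seed=0):
    """Return a copy of binary labels with a random fraction flipped (0 <-> 1)."""
    rng = random.Random(seed)
    n_flip = round(noise_fraction * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - x if i in flip_idx else x for i, x in enumerate(labels)]

# Toy labels (hypothetical): 20 residues alternating core/surface.
labels = [1, 0] * 10
noisy = flip_labels(labels, noise_fraction=0.10)
n_changed = sum(a != b for a, b in zip(labels, noisy))
print(f"{n_changed} of {len(labels)} labels flipped")
```

Uniform flips like these hit every residue type equally, whereas the predictor errors described in the post concentrate on specific residues, so the two are not interchangeable.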
How much information does it take to fold a protein? Not much, if you use the right information! We find that residue burial, a binary label of core vs surface, encodes a protein's fold highly efficiently and even improves ESM2's structure representation. 1/8 www.biorxiv.org/content/10.6...
To test this, we encode ~24,000 CASP structural models using different representations, for example contact maps (N(N-1)/2 pairwise binary labels) and core identity (N binary labels), and ask: how well does each predict the accuracy of the backbone (LDDT)? 2/8
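The difference in size between the two encodings can be seen in a small sketch that builds a binary Cα contact map for a toy chain. The 8 Å cutoff and the coordinates are assumptions chosen for illustration; the post does not state the contact definition used.

```python
import math
from itertools import combinations

def contact_labels(ca_coords, cutoff=8.0):
    """Binary contact label for every residue pair i < j (cutoff in angstroms, assumed)."""
    return {
        (i, j): int(math.dist(ca_coords[i], ca_coords[j]) < cutoff)
        for i, j in combinations(range(len(ca_coords)), 2)
    }

# Toy C-alpha trace (hypothetical coordinates).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
contacts = contact_labels(coords)
n = len(coords)
print(f"{n} burial labels vs {len(contacts)} contact labels")
```

The contact map grows quadratically with chain length while core identity grows linearly, which is what makes the comparison in the thread notable.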
Surprisingly, matching just the N binary burial labels to the experimental structure predicts LDDT nearly as well (ρ=0.94) as matching the full N(N-1)/2 contact map (ρ=0.95). A single label per residue rivals a pairwise representation. 3/8
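A minimal sketch of "matching" burial labels is the fraction of residues whose core/surface label agrees with the experimental structure. This simple agreement score is an assumption for illustration and is not necessarily the score defined in the preprint; the labels below are toy data.

```python
def label_agreement(model_labels, native_labels):
    """Fraction of residues whose binary burial label matches the reference."""
    if len(model_labels) != len(native_labels) or not native_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(m == n for m, n in zip(model_labels, native_labels))
    return matches / len(native_labels)

# Toy labels (hypothetical): 1 = core, 0 = surface.
native = [1, 1, 0, 0, 1, 0, 0, 1]
model = [1, 0, 0, 0, 1, 0, 1, 1]
score = label_agreement(model, native)
print(f"agreement: {score:.2f}")
```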
2mo
Manning Research Group
Protein structure is controlled by a high-dimensional energy landscape, which is a function of all of the atomic coordinates of the protein. Can this landscape be accurately described by a low-dimensional representation? We find that residue core identity, a binary N-dimensional encoding indicating whether each of the N amino acids in a protein is buried in the core or not, can predict the protein's backbone conformation more efficiently than all other representations that we tested. Core identity is 4 times more efficient than previous estimates of the bits per residue needed to encode a protein's native fold, 2 times more efficient than the Cα contact map, and 1.5 times more efficient than the machine-learned embeddings from FoldSeek's 3Di. Even when the folded structure is unavailable, predicting each residue's burial from sequence yields a more accurate estimate of fold quality than predicting pairwise contacts from the same sequence information. Thus, this work emphasizes that the problem of determining a protein's native fold can be re-framed as predicting each residue's core identity.

### Competing Interest Statement

The authors have declared no competing interest.

Chan Zuckerberg Initiative (United States), 2023-329572
NIH, T32GM145452
www.biorxiv.org
Residue burial encodes a protein's fold
Alex Grigas
Predict a protein structure's backbone accuracy (average LDDT) by comparing its core residues to the core predicted using ESM2 embeddings - agrigas115/core_identity_score
github.com