Inlay

Fresh from bioRxiv: our latest work introduces The Embedded Alphabet (TEA), a powerful new representation for protein sequences obtained by discretising ESM2 embeddings into a 20-letter alphabet. Pre-print: www.biorxiv.org/content/10.1... 🧵👇(1/n)

Detecting remote homology with speed and sensitivity is crucial for tasks like function annotation and structure prediction. We introduce a novel approach using contrastive learning to convert protein...