An interesting "what have we been doing all these years?" result from this paper is how suboptimal the widely used uniform sampling scheme can be (cluster everything at 50% identity, then sample from all clusters equally). In contrast, strategies that account for the relative differences in cluster size improve validation loss.
Diego del Alamo
ProGen3 is out and shows a cool result: as PLMs get larger, they can successfully generate across a broader cross-section of the protein fold space www.profluent.bio/showcase/pro...
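
A minimal sketch of the sampling contrast in the first post, not the paper's actual method: the `alpha` exponent, the function names, and the toy clusters are all assumptions made up for illustration. With `alpha = 0` every cluster is equally likely (the uniform scheme); with `alpha = 1` clusters are drawn in proportion to their size; values in between interpolate.

```python
import random


def cluster_sampling_probs(cluster_sizes, alpha):
    """Return the probability of picking each cluster.

    Hypothetical weighting: weight = size ** alpha.
    alpha = 0 -> uniform over clusters; alpha = 1 -> proportional to size.
    """
    weights = {cid: size ** alpha for cid, size in cluster_sizes.items()}
    total = sum(weights.values())
    return {cid: w / total for cid, w in weights.items()}


def sample_sequence(clusters, alpha, rng):
    """Pick a cluster by its weight, then pick one member uniformly."""
    sizes = {cid: len(members) for cid, members in clusters.items()}
    probs = cluster_sampling_probs(sizes, alpha)
    ids = list(probs)
    cid = rng.choices(ids, weights=[probs[c] for c in ids], k=1)[0]
    return rng.choice(clusters[cid])


# Toy data (made up): three clusters of very different sizes.
clusters = {
    "big": [f"big_seq_{i}" for i in range(90)],
    "mid": [f"mid_seq_{i}" for i in range(9)],
    "tiny": ["tiny_seq_0"],
}
sizes = {cid: len(members) for cid, members in clusters.items()}

uniform_probs = cluster_sampling_probs(sizes, alpha=0.0)
size_aware_probs = cluster_sampling_probs(sizes, alpha=1.0)

rng = random.Random(0)
example = sample_sequence(clusters, alpha=0.5, rng=rng)
```

Under the uniform scheme the single sequence in `tiny` is seen as often as all 90 sequences in `big` combined, which is the imbalance that size-aware weighting is meant to soften.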