The training details for VHHBERT are sparse, but it seems like the training sequences were not clustered at all. Anyway, the finding suggests that these models all represent CDRH3 residues as a linear combination of residue-type features and position features, which is an incredibly underpowered representation.
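
To make "linear combination of residue-type features and position features" concrete, here is a rough sketch of how one could test for it. This is not necessarily how the finding was obtained: the function name `additive_r2`, the array shapes, and the synthetic data are all assumptions for illustration. The idea is to regress per-residue embeddings on one-hot residue type plus one-hot position and see how much variance that purely additive model explains. An R² close to 1 would mean the embedding carries almost no residue-by-position interaction.

```python
import numpy as np

def additive_r2(emb, aa_idx, pos_idx, n_aa=20, n_pos=None):
    """R^2 of the additive model emb ~ onehot(residue type) + onehot(position).

    emb:     (n_residues, dim) per-residue embeddings (hypothetical input)
    aa_idx:  (n_residues,) integer residue-type index in [0, n_aa)
    pos_idx: (n_residues,) integer position index in [0, n_pos)
    """
    emb = np.asarray(emb, dtype=float)
    aa_idx = np.asarray(aa_idx)
    pos_idx = np.asarray(pos_idx)
    if n_pos is None:
        n_pos = int(pos_idx.max()) + 1
    n = emb.shape[0]

    # Design matrix: one block of residue-type indicators, one of position indicators.
    X = np.zeros((n, n_aa + n_pos))
    X[np.arange(n), aa_idx] = 1.0
    X[np.arange(n), n_aa + pos_idx] = 1.0

    # X is rank-deficient (both blocks sum to a column of ones);
    # lstsq returns the minimum-norm solution, which is fine for computing R^2.
    coef, *_ = np.linalg.lstsq(X, emb, rcond=None)
    resid = emb - X @ coef
    total = emb - emb.mean(axis=0)
    return 1.0 - (resid ** 2).sum() / (total ** 2).sum()

# Synthetic demo (assumed data, not real model embeddings).
rng = np.random.default_rng(0)
n, n_aa, n_pos, dim = 2000, 20, 12, 8
aa = rng.integers(0, n_aa, n)
pos = rng.integers(0, n_pos, n)

# Case 1: embeddings that really are residue vector + position vector.
A = rng.normal(size=(n_aa, dim))
P = rng.normal(size=(n_pos, dim))
r2_additive = additive_r2(A[aa] + P[pos], aa, pos, n_aa, n_pos)

# Case 2: every (residue, position) pair gets its own unrelated vector.
T = rng.normal(size=(n_aa, n_pos, dim))
r2_interaction = additive_r2(T[aa, pos], aa, pos, n_aa, n_pos)

print(f"additive embeddings:    R^2 = {r2_additive:.3f}")
print(f"interaction embeddings: R^2 = {r2_interaction:.3f}")
```

On real embeddings, a held-out split would be the safer way to report this number, since the in-sample R² is optimistic when there are few residues per (type, position) cell.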