📄 Paper 2 (submitted to IS): "Self-Supervised Speech Models Encode Phonetic Context via Position-dependent Orthogonal Subspaces"
We further show how sequences of phone(me)s can be encoded, i.e., contextualized, in a single S3M frame.
arxiv.org/abs/2603.12642