Language use is language change—every new conversation that is signed or spoken and every new sentence that is written is an incremental amendment in the social contract that binds a language community together.
We first apply this method to Serbian (Latin and Cyrillic scripts) and Chinese (simplified and traditional characters). For both languages, we outperform a prompting baseline for script control in many cases.
✨New paper✨
We find script (e.g. Cyrillic, Latin) to be a linear direction in the activation space of Whisper. Adding such script directions to the activations enables transliteration at test time, producing e.g. Japanese transcriptions in Cyrillic.
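
To make the idea of "adding a script direction" concrete, here is a minimal toy sketch in NumPy. It assumes the direction is estimated as a difference of mean activations between two scripts, which is a common steering recipe and not necessarily the paper's exact procedure. The arrays are random stand-ins, not real Whisper activations, and `script_direction`, `steer`, and `alpha` are hypothetical names.

```python
import numpy as np


def script_direction(acts_target, acts_source):
    """Difference of mean activations between two scripts, shape [d_model].

    Assumption: a difference-of-means estimate; the paper may extract
    its directions differently.
    """
    return acts_target.mean(axis=0) - acts_source.mean(axis=0)


def steer(hidden, direction, alpha=1.0):
    """Add the scaled direction to every position of a [seq_len, d_model] array."""
    return hidden + alpha * direction


# Toy data standing in for activations collected on Latin vs. Cyrillic text.
rng = np.random.default_rng(0)
d_model = 8
offset = np.zeros(d_model)
offset[0] = 3.0  # pretend "script" lives mostly along dimension 0
latin = rng.normal(size=(100, d_model))
cyrillic = rng.normal(size=(100, d_model)) + offset

direction = script_direction(cyrillic, latin)

# Steer a new sequence of activations toward the "Cyrillic" side.
hidden = rng.normal(size=(5, d_model))
steered = steer(hidden, direction, alpha=1.0)
```

In a real model, the addition would happen inside the forward pass (for example via a hook on a chosen layer), and `alpha` would control how strongly the output is pushed toward the target script.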