Language use is language change: every new conversation that is signed or spoken and every new sentence that is written is an incremental amendment in the social contract that binds a language community together.
We then find that our Latin and Cyrillic directions can be added to activations in other languages at test time to induce transliteration. Romanization examples below:
We first apply this to Serbian in Latin and Cyrillic characters, and to Chinese in simplified and traditional characters. For both languages, we outperform a prompting baseline for script control in many cases.
✨New paper✨
We find script (e.g. Cyrillic, Latin) to be a linear direction in the activation space of Whisper, enabling transliteration at test-time by adding such script directions to the activations, producing e.g. Cyrillic Japanese transcriptions.
MaiNLP is turning 3 today! 🥳 We've grown a lot since @barbaraplank.bsky.social started this group with nothing but three aspiring researchers and a hand-drawn sign on the door. Huge thanks to all the amazing people who have joined or visited us since. Here's to many more years of exciting research!
It's publication day for Gesture: A Slim Guide
If you have been wanting to think about gesture in your own research, bring it into your teaching or connect with the field of Gesture Studies, this is for you. It's under 50k words and has a nifty glossary too.
But how well does it perform? Up to almost 75% accuracy under a zero-shot setting, suggesting that the romanization direction may be broadly similar across languages. Furthermore, our induced transliterations often appear more faithful to the pronunciation than the deterministic ground truth.
Excited to announce the 1st Workshop on Bridging NLP and Public Opinion Research at COLM 2025, Oct 10th in Montreal 🇨🇦
As LLMs reshape public discourse and research, collaboration between NLP and Public Opinion Research (POR) is more vital than ever. #NLPOR Submit by June 23:
tinyurl.com/nlpor25
Our method is simple. We collect activations in the source and target scripts, take their difference to isolate script information, then add that difference to activations at test time to induce a script change. This is similar to the king - man + woman = queen vector arithmetic.
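For anyone who wants the arithmetic spelled out, here is a rough toy sketch of the idea, not the paper's code. It assumes the "difference" is a difference of mean activations, uses made-up 3-dimensional vectors instead of real Whisper hidden states, and the function names and the `alpha` scaling factor are hypothetical.

```python
from statistics import fmean


def mean_vector(rows):
    """Average a list of equal-length activation vectors, dimension by dimension."""
    return [fmean(col) for col in zip(*rows)]


def script_direction(source_acts, target_acts):
    """Difference of mean activations: target script minus source script.

    Assumption: averaging before subtracting is one plausible way to
    "take their difference"; the paper may do this differently.
    """
    src_mean = mean_vector(source_acts)
    tgt_mean = mean_vector(target_acts)
    return [t - s for s, t in zip(src_mean, tgt_mean)]


def steer(activation, direction, alpha=1.0):
    """Add the scaled script direction to one activation vector at test time.

    `alpha` is a hypothetical strength knob, not something named in the posts.
    """
    return [a + alpha * d for a, d in zip(activation, direction)]


# Toy "activations" (invented numbers, not real model states).
latin_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
cyrillic_acts = [[1.0, 4.0, 2.0], [3.0, 6.0, 2.0]]

# Direction pointing from Latin-script activations toward Cyrillic-script ones.
direction = script_direction(latin_acts, cyrillic_acts)

# Steer an unrelated activation (e.g. from another language) toward Cyrillic.
steered = steer([0.5, 0.5, 0.5], direction)
```

In a real model the same addition would presumably be applied to hidden states inside the network during decoding; which layer and how strongly are details this sketch does not cover.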
Paper link here: arxiv.org/abs/2601.02906
Joint work with @juice500ml.bsky.social, @kalvinchang.bsky.social, Ming-Hao Hsu, @florian-eichin.com, Zhizheng Wu, Alane Suhr, @mhedderich.bsky.social, David Harwath, @davidrmortensen.bsky.social, and @barbaraplank.bsky.social!