Multilingual 🤝 reasoning 🤝 test-time scaling 🔥🔥🔥
New preprint!
@yongzx.bsky.social has all the details 👇
Can English-finetuned LLMs reason in other languages?
Short Answer: Yes, thanks to “quote-and-think” + test-time scaling. You can even force them to reason in a target language!
But:
🌐 Low-resource langs & non-STEM topics still tough.
New paper: arxiv.org/abs/2505.05408
🚨LLM safety research needs to be at least as multilingual as our models.
Where does the field currently stand, and how do we progress from here?
This work led by @yongzx.bsky.social has answers! 👇
It’s been two years since cross-lingual jailbreaks were first discovered. How far has the multilingual LLM safety research field advanced? 🤔
📏 Our comprehensive survey reveals that there is still a long way to go.
Julia Kreutzer
Cohere Labs
arxiv.org
Reasoning capabilities of large language models are primarily studied for English, even when pretrained models are multilingual. In this work, we investigate to what extent English reasoning finetunin...
📣 New paper!
We observe that reasoning language models finetuned only on English data are capable of zero-shot cross-lingual reasoning through a "quote-and-think" pattern.
However, this does not mean they reason the same way across all languages or in new domains.
[1/N]