WiAIR is dedicated to celebrating the remarkable contributions of female AI researchers from around the globe. Our goal is to empower early-career researchers, especially women, to pursue their passion for AI and make an impact in this exciting field.

Women in AI Research - WiAIR

🎧 Listen to the episode!
🎬 YouTube: www.youtube.com/watch?v=3QXH...
🎙️ Spotify: open.spotify.com/episode/1aWC...
🍎 Apple: podcasts.apple.com/ca/podcast/1...
📄 Paper: arxiv.org/pdf/2601.11778
#WiAIR #MultilingualAI #LLMs #MachineTranslation #NLProc

Neural MT metrics show the strongest alignment with downstream performance. But the proxy has limits: some specialized benchmarks, including MGSM and INCLUDE, show weaker or more variable correlations. Task-specific evaluation remains necessary. (4/5 🧵)

Translation quality is measured on 3 parallel corpora using 7 MT metrics, both lexical and neural, then systematically correlated with downstream benchmark scores across languages and tasks. (3/5 🧵)
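
As a rough illustration of the correlation step described in this post, here is a minimal sketch that correlates one MT metric with one benchmark across languages. It assumes Pearson correlation (the post does not say which coefficient the paper uses), and the language codes and scores are made-up placeholders, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-language scores for a single model (illustrative only).
mt_metric = {"de": 0.88, "sw": 0.61, "hi": 0.74, "yo": 0.52}
benchmark = {"de": 0.71, "sw": 0.42, "hi": 0.58, "yo": 0.35}

# Align both score sets on the same language order before correlating.
languages = sorted(mt_metric)
r = pearson(
    [mt_metric[lang] for lang in languages],
    [benchmark[lang] for lang in languages],
)
print(f"Pearson r across {len(languages)} languages: {r:.3f}")
```

Repeating this for every metric, benchmark, and model would give a grid of coefficients to compare; a rank-based coefficient such as Spearman could be swapped in the same way.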

The paper evaluates 14 LLMs across 5 model families on 9 multilingual benchmarks spanning knowledge, reading comprehension, NLI, commonsense & mathematical reasoning, truthfulness, and regional knowledge. (2/5 🧵)

✨ Can translation quality serve as a scalable proxy for multilingual LLM evaluation? In our latest #WiAIR episode, we host Dr. Saadia Gabriel (@skgabrie.bsky.social) to discuss "Translation as a Scalable Proxy for Multilingual Evaluation". (1/5 🧵)
