Introducing Global PIQA, a new multilingual benchmark for 100+ languages. This benchmark is the outcome of this year’s MRL shared task, in collaboration with 300+ researchers from 65 countries. This dataset evaluates physical commonsense reasoning in culturally relevant contexts.
-> it’s paradigm replacement, not task automation, that causes widespread job displacement
from substack.com/home/post/p-...
cool analogy for AI & jobs:
- when ATMs came, the number of bank tellers rose, bc ATMs lowered the cost of running bank branches, so more branches opened
- but in the 2010s, the number of bank tellers plummeted, bc mobile banking made branches unnecessary
Whoa the #WMT25 results on MT Evaluation are wild! ChrF outperforms pretty much all neural metrics 🙀
the paper www2.statmt.org/wmt25/pdf/20...
in Vienna for ACL, presenting Tulun, a system for low-resource in-domain translation, using LLMs
Tuesday @ 4pm
Working w 2 real use cases: medical translation into Tetun 🇹🇱 & disaster relief speech translation in Bislama 🇻🇺