Stefano
9mo

Our new #EMNLP2025 paper is out: "Estimating Machine Translation Difficulty"! 🚀 Are today's #MachineTranslation systems flawless? When SOTA models all achieve near-perfect scores on standard benchmarks, we hit an evaluation ceiling. How can we tell their true capabilities and drive future progress?

💡 Our solution: increase benchmark difficulty! What if we could predict in advance which texts are hard to translate? We introduce Translation Difficulty Estimation as a novel task to automatically identify challenging texts for MT systems.

In our paper, we:
1️⃣ Define the task and introduce Difficulty Estimation Correlation to evaluate difficulty estimators.
2️⃣ Benchmark a wide range of methods, establishing the first SOTA.
3️⃣ Demonstrate their effectiveness in building more challenging test sets automatically.

🔍 Our most surprising finding? LLM-based methods struggle with this task, performing worse than even simple heuristics like sentence length. In contrast, our specialized, trained models are the clear winners.

🤖 We release our best models, sentinel-src-24 and sentinel-src-25! Use them to build more robust evaluations, filter data, or explore applications in other areas such as curriculum learning.

📄 Paper: arxiv.org/abs/2508.10175
🤗 Models: huggingface.co/collections/...
💻 Code: github.com/zouharvi/tra...

A huge thanks to my fantastic co-authors: Lorenzo Proietti, @zouharvi.bsky.social, Roberto Navigli, and @kocmitom.bsky.social. 👏 #AI #NLProc #Evaluation