** New parallel data set ** . We've just released HPLT v2.0, a parallel data set of 50 languages paired with English, 380M sentence pairs in total. Extracted from the Internet Archive and Common Crawl hplt-project.org/datasets/v2.0
📣 ACL 2026 SRW Direct Submissions are open!
🗓️ Deadline: March 18, 2026 (AoE)
Submit your work via OpenReview: openreview.net/group?id=acl...
We can’t wait to see your contributions—good luck! 💪✨
#ACL2026 #SRW #NLProc
The CfP for the SRW at ACL 2026 is out!
I'm part of this! There's also a paper: arxiv.org/abs/2503.10267
NAACL was a blast 💥 Presented the findings of our Shared Tasks at @americasnlp.bsky.social, had a chance to reconnect with old friends, make new ones, and get excited about research I'm passionate about. #NAACL25
Last week I was at @aclmeeting.bsky.social ! Lots of friendly faces, great work and amazing art ✨️ We presented HPLT v2 datasets together with @very-laurie.bsky.social 🎉 Read our paper here: aclanthology.org/2025.acl-lon...
See you next week at EMNLP!
We will be presenting our work: Scaling Low-Resource MT via Synthetic Data Generation with LLMs
📍 Poster Session 13
📅 Fri, Nov 7, 10:30-12:00 - Hall C
📖 Check it out! arxiv.org/abs/2505.14423
@helsinki-nlp.bsky.social @cambridgenlp.bsky.social @emnlpmeeting.bsky.social
Come to Helsinki for the 18th MT Marathon! Sponsored by EAMT @ufal-cuni.bsky.social
That's a wrap for @nodalida.bsky.social ! Short, nice and intense. I presented our work on efficient MT @helsinki-nlp.bsky.social within the #HPLT project⚡️
🚀 Just added our HPLT fast translation models to a new TranslateLocally repository! Translate on your own machine—fast, private, and easy. shorturl.at/R2vzw
20+ models for diverse languages—learn more about them next week at @nodalida.bsky.social!
A space that combines petabytes of natural language data with large-scale model training
hplt-project.org
** New parallel data set ** . We've just released HPLT v2.0, a parallel data set of 50 languages paired with English, 380M sentence pairs in total. Extracted from the Internet Archive and Common Crawl hplt-project.org/datasets/v2.0
We investigate the potential of LLM-generated synthetic data for improving low-resource Machine Translation (MT). Focusing on seven diverse target languages, we construct a document-level synthetic co...
🎉 We’re excited to announce the ACL 2026 Student Research Workshop (SRW) website is live!
🌐 acl2026-srw.github.io
brings together students across NLP to present research, receive mentorship, and engage with the global research community.
🧵 Key details ⬇️
#ACL2026 #NLProc
Barry Haddow
ACL SRW 2026
Call for participation: We just opened the registration for this year's MT Marathon in August in Helsinki, Finland: blogs.helsinki.fi/language-tec..., featuring:
- Ayodele Awokoya
- Wilker Aziz
- Marta Costa-Jussa
- Barry Haddow
- Amit Moryosse
- Sara Papi
- Jörg Tiedemann
- Marco Turchi