ProfilePosts
** New parallel data set ** We've just released HPLT v2.0, a parallel data set of 50 languages paired with English, 380M sentence pairs in total, extracted from the Internet Archive and Common Crawl: hplt-project.org/datasets/v2.0
📣 ACL 2026 SRW Direct Submissions are open! 🗓️ Deadline: March 18, 2026 (AoE) Submit your work via OpenReview: openreview.net/group?id=acl... We can’t wait to see your contributions—good luck! 💪✨ #ACL2026 #SRW #NLProc
The CfP for the SRW at ACL 2026 is out!
I'm part of this! There's also a paper: arxiv.org/abs/2503.10267
NAACL was a blast 💥 Presented the findings of our Shared Tasks at @americasnlp.bsky.social, had a chance to reconnect with old friends, make new ones, and get excited about research I'm passionate about. #NAACL25
Feb 28, 2025
3mo
5mo
Last week I was at @aclmeeting.bsky.social! Lots of friendly faces, great work and amazing art ✨️ We presented the HPLT v2 datasets together with @very-laurie.bsky.social 🎉 Read our paper here: aclanthology.org/2025.acl-lon...
See you next week at EMNLP! We will be presenting our work: Scaling Low-Resource MT via Synthetic Data Generation with LLMs 📍 Poster Session 13 📅 Fri, Nov 7, 10:30-12:00 - Hall C 📖 Check it out! arxiv.org/abs/2505.14423 @helsinki-nlp.bsky.social @cambridgenlp.bsky.social @emnlpmeeting.bsky.social
Mar 17, 2025
May 5, 2025
Come to Helsinki for the 18th MT Marathon! Sponsored by EAMT @ufal-cuni.bsky.social
That's a wrap for @nodalida.bsky.social! Short, nice and intense. I presented our work on efficient MT @helsinki-nlp.bsky.social within the #HPLT project⚡️
🚀 Just added our HPLT fast translation models to a new TranslateLocally repository! Translate on your own machine—fast, private, and easy. shorturl.at/R2vzw 20+ models for diverse languages—learn more about them next week at @nodalida.bsky.social!
10mo
7mo
Mar 18, 2025
Mar 5, 2025
Feb 24, 2025
Barry Haddow
ACL SRW 2026
Ona de Gibert
Ona de Gibert
Laurie Burchell
Ona de Gibert
Ona de Gibert
Ona de Gibert
Ona de Gibert
Ona de Gibert
HPLT - High Performance Language Technologies
A space that combines petabytes of natural language data with large-scale model training
hplt-project.org
We investigate the potential of LLM-generated synthetic data for improving low-resource Machine Translation (MT). Focusing on seven diverse target languages, we construct a document-level synthetic co...
arxiv.org
Scaling Low-Resource MT via Synthetic Data Generation with LLMs
🎉 We’re excited to announce the ACL 2026 Student Research Workshop (SRW) website is live! 🌐 acl2026-srw.github.io brings together students across NLP to present research, receive mentorship, and engage with the global research community. 🧵 Key details ⬇️ #ACL2026 #NLProc
Call for participation: We just opened the registration for this year's MT Marathon in August in Helsinki, Finland: blogs.helsinki.fi/language-tec..., featuring: - Ayodele Awokoya - Wilker Aziz - Marta Costa-Jussa - Barry Haddow - Amit Moryossef - Sara Papi - Jörg Tiedemann - Marco Turchi
Helsinki NLP
blogs.helsinki.fi