ProfileReplies
I'm part of this! There's also a paper: arxiv.org/abs/2503.10267
Mar 17, 2025
**New parallel data set.** We've just released HPLT v2.0, a parallel data set of 50 languages paired with English, 380M sentence pairs in total. Extracted from the Internet Archive and Common Crawl: hplt-project.org/datasets/v2.0
Feb 28, 2025
A space that combines petabytes of natural language data with large-scale model training
HPLT - High Performance Language Technologies
Laurie Burchell
hplt-project.org
📣 ACL 2026 SRW Direct Submissions are open! 🗓️ Deadline: March 18, 2026 (AoE) Submit your work via OpenReview: openreview.net/group?id=acl... We can’t wait to see your contributions—good luck! 💪✨ #ACL2026 #SRW #NLProc
Barry Haddow
3mo
The CfP for the SRW at ACL 2026 is out!
Barry Haddow
5mo
Come to Helsinki for the 18th MT Marathon! Sponsored by EAMT @ufal-cuni.bsky.social
Last week I was at @aclmeeting.bsky.social! Lots of friendly faces, great work and amazing art ✨️ We presented HPLT v2 datasets together with @very-laurie.bsky.social 🎉 Read our paper here: aclanthology.org/2025.acl-lon...
See you next week at EMNLP! We will be presenting our work: Scaling Low-Resource MT via Synthetic Data Generation with LLMs 📍 Poster Session 13 📅 Fri, Nov 7, 10:30-12:00 - Hall C 📖 Check it out! arxiv.org/abs/2505.14423 @helsinki-nlp.bsky.social @cambridgenlp.bsky.social @emnlpmeeting.bsky.social
NAACL was a blast 💥 Presented the findings of our Shared Tasks at @americasnlp.bsky.social, had a chance to reconnect with old friends, make new ones, and get excited about research I'm passionate about. #NAACL25
🚀 Just added our HPLT fast translation models to a new TranslateLocally repository! Translate on your own machine—fast, private, and easy. shorturl.at/R2vzw 20+ models for diverse languages—learn more about them next week at @nodalida.bsky.social!
That's a wrap for @nodalida.bsky.social! Short, nice and intense. I presented our work on efficient MT @helsinki-nlp.bsky.social within the #HPLT project ⚡️
Mar 18, 2025
10mo
7mo
May 5, 2025
Feb 24, 2025
Mar 5, 2025
ACL SRW 2026
We investigate the potential of LLM-generated synthetic data for improving low-resource Machine Translation (MT). Focusing on seven diverse target languages, we construct a document-level synthetic co...
arxiv.org
Scaling Low-Resource MT via Synthetic Data Generation with LLMs
Ona de Gibert
🎉 We’re excited to announce the ACL 2026 Student Research Workshop (SRW) website is live! 🌐 acl2026-srw.github.io brings together students across NLP to present research, receive mentorship, and engage with the global research community. 🧵 Key details ⬇️ #ACL2026 #NLProc
5mo
ACL SRW 2026
Call for participation: We just opened the registration for this year's MT Marathon in August in Helsinki, Finland: blogs.helsinki.fi/language-tec..., featuring:
- Ayodele Awokoya
- Wilker Aziz
- Marta Costa-Jussa
- Barry Haddow
- Amit Moryosse
- Sara Papi
- Jörg Tiedemann
- Marco Turchi
Mar 18, 2025
blogs.helsinki.fi
Helsinki NLP