Profile
PhD Student @HelsinkiNLP / Low-resource, Machine Translation, Knowledge Distillation, Multilinguality
Ona de Gibert









The CfP for the SRW at ACL 2026 is out!
📣 ACL 2026 SRW Direct Submissions are open! 🗓️ Deadline: March 18, 2026 (AoE) Submit your work via OpenReview: openreview.net/group?id=acl... We can’t wait to see your contributions—good luck! 💪✨ #ACL2026 #SRW #NLProc
Come to Helsinki for the 18th MT Marathon! Sponsored by EAMT @ufal-cuni.bsky.social
That's a wrap for @nodalida.bsky.social ! Short, nice and intense. I presented our work on efficient MT @helsinki-nlp.bsky.social within the #HPLT project⚡️
** New parallel data set ** . We've just released HPLT v2.0, a parallel data set of 50 languages paired with English, 380M sentence pairs in total. Extracted from the Internet Archive and Common Crawl hplt-project.org/datasets/v2.0
🚀 Just added our HPLT fast translation models to a new TranslateLocally repository! Translate on your own machine—fast, private, and easy. shorturl.at/R2vzw 20+ models for diverse languages—learn more about them next week at @nodalida.bsky.social!
See you next week at EMNLP! We will be presenting our work: Scaling Low-Resource MT via Synthetic Data Generation with LLMs 📍 Poster Session 13 📅 Fri, Nov 7, 10:30-12:00 - Hall C 📖 Check it out! arxiv.org/abs/2505.14423 @helsinki-nlp.bsky.social @cambridgenlp.bsky.social @emnlpmeeting.bsky.social
Last week I was at @aclmeeting.bsky.social ! Lots of friendly faces, great work and amazing art ✨️ We presented HPLT v2 datasets together with @very-laurie.bsky.social 🎉 Read our paper here: aclanthology.org/2025.acl-lon...
I'm part of this! There's also a paper: arxiv.org/abs/2503.10267
NAACL was a blast 💥 Presented the findings of our Shared Tasks at @americasnlp.bsky.social, had a chance to reconnect with old friends, make new ones, and get excited about research I'm passionate about. #NAACL25
5mo
3mo
Mar 18, 2025
Mar 5, 2025
Feb 28, 2025
Feb 24, 2025
7mo
10mo
Mar 17, 2025
May 5, 2025
A space that combines petabytes of natural language data with large-scale model training
hplt-project.org
We investigate the potential of LLM-generated synthetic data for improving low-resource Machine Translation (MT). Focusing on seven diverse target languages, we construct a document-level synthetic co...
arxiv.org
HPLT - High Performance Language Technologies
Scaling Low-Resource MT via Synthetic Data Generation with LLMs
ACL SRW 2026
Barry Haddow
Laurie Burchell
🎉 We’re excited to announce the ACL 2026 Student Research Workshop (SRW) website is live! 🌐 acl2026-srw.github.io brings together students across NLP to present research, receive mentorship, and engage with the global research community. 🧵 Key details ⬇️ #ACL2026 #NLProc
5mo
Call for participation: We just opened the registration for this year's MT Marathon in August in Helsinki, Finland: blogs.helsinki.fi/language-tec..., featuring:
- Ayodele Awokoya
- Wilker Aziz
- Marta Costa-Jussa
- Barry Haddow
- Amit Moryossef
- Sara Papi
- Jörg Tiedemann
- Marco Turchi
Feb 28, 2025
Mar 18, 2025
blogs.helsinki.fi
Helsinki NLP