Senior Research Engineer with the Common Crawl Foundation.
(languages ∪ tech) in Dùn Èideann
Laurie Burchell
Love this widget by Daan van Esch: daanvanesch.nl/langid/index... - compare language ID predictions in your browser!
I'm learning Rust right now and I'd recommend the experimental version of the Rust Book: rust-book.cs.brown.edu. The quizzes help you understand that you don't actually understand ownership 👍
Happening now! @pjox.bsky.social and I are giving a talk for @eleutherai.bsky.social on CommonLID, a community-driven web domain evaluation dataset for language identification. Join here: discord.gg/aYy3Se7Q?eve...
Paper: arxiv.org/abs/2601.18026
@commoncrawl.bsky.social
"The true genius is a mind of large general powers, accidentally determined to some particular direction."
~ Samuel Johnson
The Empty City
RSVP and join speakers @very-laurie.bsky.social and @pjox.bsky.social from the Common Crawl Foundation and Kostis Saitas Zarkias and Robert Pugh from Mozilla Data Collective for a truly hands-on session.
Thursday, June 4th
6 PM CEST | 12 PM ET | 9 AM PDT
Register via Zoom: zoom.us/meeting/regi...
Language identification still proves to be a challenging task, especially for web data. In collaboration with @mlcommons.org, @eleutherai.bsky.social, @jhu.edu, and 97 community members, we created CommonLID, a new benchmark for LangID for 100+ languages!