Happening now! @pjox.bsky.social and I are giving a talk for @eleutherai.bsky.social on CommonLID, a community-driven, web-domain evaluation dataset for language identification. Join here: discord.gg/aYy3Se7Q?eve...
Paper: arxiv.org/abs/2601.18026
@commoncrawl.bsky.social