While we're waiting for the next issue of the CL Journal, you can get early access to upcoming articles here: direct.mit.edu/coli/online-... #NLProc #CLJournal
How Can We Effectively Expand the Vocabulary of LLMs with 0.01GB of Target Language Text? This article explores a very important question for low-resource languages by experimenting with various techniques across 10 languages. A must-read at: https://doi.org/10.1162/COLI.a.581 @gucciiiii.bsky.social
To assess whether multilingual NLP performs well across languages, we would need to evaluate it on all the world's languages. That is not feasible. This paper implements two sampling methods from linguistic typology and provides a Python package to facilitate this: doi.org/10.1162/COLI... @andreashhp.bsky.social
CL Journal has over 50 years of history in the field. While the field keeps moving, some articles from decades ago still offer perspective and relevant content. What do you think of these articles from 20 and 40 years ago (2006 and 1986)? And yes, they used to come in paper format too! #NLProc
While we're preparing the next paper or next research question to explore, perhaps reading some of the older articles can give us a wider perspective.
Should NLP metrics for bilingual code-switching use words or Intonation Units as the token level? The authors of this article show that Intonation Units enhance comparisons between bilingual individuals, settings, and communities: doi.org/10.1162/COLI... #NLProc #NLP @pennlinguistics.bsky.social