"CLDF Meta" (meta.clld.org) links to data from over 800 CLDF datasets that have been released on Zenodo (e.g. WALS, APiCS and Grambank). There are links to data on over 9000 languoids, e.g. 98 entries on Ambulas (to take a random language). Great work by my colleague Johannes Englisch!
With big thanks to Alexey Koshevoy (@alexeykoshevoy.bsky.social), Marie Hallo (@marie-hl.bsky.social), and Rowan Hall Maudslay!
Scientists are asking if it might be unconscious
New post: Tagging my blog with BERTopic and LLMs.
I suspect as token costs increase, we'll see more blended traditional ML/LLM systems. And, in general, they work really well!
vickiboykis.com/2026/05/18/t...
LLMs mean you still need a human in the loop, but in a different part
🚨New paper alert!🚨 In our new paper with @oliviermorin.bsky.social, Rowan Hall Maudslay, and Marie Halo, published in the Journal of Language Evolution, we explored the relationship between word length and ambiguity in a large cross-linguistic dataset. 1/6