Teaching a course this fall at @isawnyu.bsky.social on text analysis for historical languages—course description here: diyclassics.github.io/isaw-f2026-g.... If you are at NYU (or a consortium program) and are interested, let me know.
#TeachAncient #DigiClass
From Whitaker's release notes... "Permission is hereby freely given for any and all use of program and data." Amazingly open license, allowing us to build cool stuff for Latin. And thanks also to Martin Keegan for hosting the maintenance of Words since 2015, cf. mk270.github.io/whitakers-wo...
Announcing—LatinCy Lexicon v0.1, a refactored version of Whitaker's Words that uses LatinCy annotations to disambiguate words/meanings. Can be added as a custom component to any LatinCy pipeline. github.com/latincy/lati... #digiclass #nlproc
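A rough sketch of what "using annotations to disambiguate" can mean in practice: when a surface form has several dictionary analyses, keep only those whose part of speech matches the tag the pipeline assigned in context. Everything here (`LexiconEntry`, `TOY_LEXICON`, `disambiguate`) is hypothetical and written in plain Python for illustration; it is not the LatinCy Lexicon API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    lemma: str
    pos: str    # UD-style coarse tag, e.g. "NOUN" or "VERB"
    gloss: str


# Toy lexicon keyed by surface form. The entries are illustrative and
# are not copied from the Whitaker's Words data files.
TOY_LEXICON = {
    "amor": [
        LexiconEntry("amor", "NOUN", "love"),
        LexiconEntry("amo", "VERB", "I am loved"),
    ],
}


def disambiguate(form, pos, lexicon=TOY_LEXICON):
    """Return the entries for `form` whose POS matches the contextual tag.

    If no entry matches, return every candidate instead of dropping them all.
    """
    candidates = lexicon.get(form.lower(), [])
    matching = [entry for entry in candidates if entry.pos == pos]
    return matching or candidates


# A tagger that labels "amor" as NOUN in context leaves one reading.
for entry in disambiguate("amor", "NOUN"):
    print(entry.lemma, "-", entry.gloss)
```

Falling back to all candidates when nothing matches is a design choice in this sketch: a tagging error then costs precision, not coverage.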
Made some LatinCy-based suggestions here—what else are people working with?
Also—hoping to implement genuine word sense disambiguation soon by combining this functionality with the LatinCy token vectors... getting there...
We can also "reinflect" Latin forms based on existing contextual morph annotations, e.g.
Not only can we start using LatinCy annotations to disambiguate words, we can also use the WW word formation logic to generate paradigms for spaCy tokens...
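A minimal sketch of the stem-plus-ending idea behind paradigm generation and reinflection, restricted to first-declension nouns and keyed by UD-style morphological features. The names `FIRST_DECLENSION`, `paradigm`, `parse_morph`, and `reinflect` are hypothetical; this is not the Whitaker's Words word formation code or the LatinCy Lexicon interface, only an illustration of the general approach.

```python
# Endings for first-declension nouns, keyed by UD-style (Case, Number) values.
# Macrons are omitted; vocative and locative are left out for brevity.
FIRST_DECLENSION = {
    ("Nom", "Sing"): "a",
    ("Gen", "Sing"): "ae",
    ("Dat", "Sing"): "ae",
    ("Acc", "Sing"): "am",
    ("Abl", "Sing"): "a",
    ("Nom", "Plur"): "ae",
    ("Gen", "Plur"): "arum",
    ("Dat", "Plur"): "is",
    ("Acc", "Plur"): "as",
    ("Abl", "Plur"): "is",
}


def paradigm(stem):
    """Generate every (Case, Number) form for a first-declension stem."""
    return {feats: stem + ending for feats, ending in FIRST_DECLENSION.items()}


def parse_morph(morph):
    """Turn a UD feature string like 'Case=Acc|Number=Sing' into a dict."""
    return dict(pair.split("=", 1) for pair in morph.split("|") if pair)


def reinflect(stem, morph, **changes):
    """Rebuild a form from its contextual features, overriding some of them."""
    feats = {**parse_morph(morph), **changes}
    return stem + FIRST_DECLENSION[(feats["Case"], feats["Number"])]


# "puellam" annotated as accusative singular, reinflected to the plural.
print(reinflect("puell", "Case=Acc|Gender=Fem|Number=Sing", Number="Plur"))
print(paradigm("puell")[("Gen", "Plur")])
```

In a real pipeline the stem and feature string would come from a token's lemma and morph annotations; here they are passed in by hand.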
New blog post on the LatinCy v3.9 XPOS tags now up here:
exploratoryphilology.org/posts/latinc...
#digiclass #nlproc
Demo notebook for LatinCy Lexicon is here... github.com/latincy/lati...
Graduate course at ISAW (Fall 2026) introducing computational methods for historical-language research: Python-based text analysis & NLP via word embeddings, transformer models, and large language mod...
Whitaker's Words lexical data as LatinCy pipeline components for Latin NLP - latincy/latincy-lexicon
With v3.9, the LatinCy pipelines introduce a new set of human-readable, Latin-specific XPOS tags designed to surface useful linguistic information not easily derived from other UD annotations.
Question for the #digitalclassics or #digital #neolatin crowd (and @diyclassics in particular, I guess): What would be the best semantic embedding models for (Neo-)Latin currently available (not word embeddings but embeddings of text chunks)?
Recommended […]
[Original post on openbiblio.social]
✨ LatinCy v3.9 sm/md/lg/trf pipelines for spaCy available ✨
- Improved tokenization and u/v normalization
- New custom Latin-specific XPOS tags
- Better, more consistent lemma/morph coverage
huggingface.co/latincy/la_c...
#digiclass #nlproc
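For readers unfamiliar with u/v normalization: Latin editions differ on whether consonantal u is printed as v, so pipelines often collapse the two before tagging. The sketch below assumes the simplest convention, mapping every v to u (and V to U). The function name `normalize_uv` is hypothetical, and the rules LatinCy v3.9 actually applies may differ.

```python
# Translation table built once; assumes a plain v -> u, V -> U mapping.
_UV_TABLE = str.maketrans({"v": "u", "V": "U"})


def normalize_uv(text):
    """Collapse the u/v spelling distinction by rewriting v as u."""
    return text.translate(_UV_TABLE)


print(normalize_uv("arma virumque cano"))
```

With this mapping, "virumque" and "uirumque" become the same string, so they can share a lemma and vector.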