Also—hoping to implement genuine word sense disambiguation soon by combining this functionality with the LatinCy token vectors... getting there...
Announcing—LatinCy Lexicon v0.1, a refactored version of Whitaker's Words that uses LatinCy annotations to disambiguate words/meanings. Can be added as a custom component to any LatinCy pipeline. github.com/latincy/lati... #digiclass #nlproc
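To make the idea of disambiguating by annotation concrete, here is a minimal sketch in plain Python. It is not the latincy-lexicon API: the `Entry` class, the `LEXICON` table, and the `disambiguate` function are all hypothetical names invented for illustration, and the two-lemma table stands in for the real Whitaker's Words data. It only shows how a lemma, a part-of-speech tag, and a gender feature of the kind a LatinCy pipeline assigns could narrow a list of candidate dictionary entries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical dictionary entry; not the real Whitaker's Words format."""
    lemma: str
    pos: str                # UD-style part-of-speech tag
    gender: Optional[str]   # None when gender is not fixed for the entry
    gloss: str


# Toy lexicon (illustrative only): homographs that annotations can separate.
LEXICON = {
    "liber": [
        Entry("liber", "NOUN", "Masc", "book"),
        Entry("liber", "ADJ", None, "free"),
    ],
    "populus": [
        Entry("populus", "NOUN", "Masc", "people, nation"),
        Entry("populus", "NOUN", "Fem", "poplar tree"),
    ],
}


def disambiguate(lemma: str, pos: str, gender: Optional[str] = None) -> List[Entry]:
    """Keep entries matching the POS tag, then narrow by gender if one is given."""
    by_pos = [e for e in LEXICON.get(lemma, []) if e.pos == pos]
    if gender is not None:
        narrowed = [e for e in by_pos if e.gender in (None, gender)]
        if narrowed:
            return narrowed
    # Without a usable gender feature, return every POS-compatible entry.
    return by_pos


if __name__ == "__main__":
    for entry in disambiguate("populus", "NOUN", "Fem"):
        print(entry.lemma, entry.pos, entry.gloss)
```

In an actual pipeline the three arguments would presumably come from token attributes such as the lemma, the POS tag, and the morphological features, but how the released component wires that up is not shown here.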
Demo notebook for LatinCy Lexicon is here... github.com/latincy/lati...
New blog post on the LatinCy v3.9 XPOS tags now up here:
exploratoryphilology.org/posts/latinc...
#digiclass #nlproc
Whitaker's Words lexical data as LatinCy pipeline components for Latin NLP - latincy/latincy-lexicon
With v3.9, the LatinCy pipelines introduce a new set of human-readable, Latin-specific XPOS tags designed to surface useful linguistic information not easily derived from other UD annotations.
✨ LatinCy v3.9 sm/md/lg/trf pipelines for spaCy available ✨
- Improved tokenization and u/v normalization
- New custom Latin-specific XPOS tags
- Better, more consistent lemma/morph coverage
huggingface.co/latincy/la_c...
#digiclass #nlproc
Patrick J. Burns
We can also "reinflect" Latin forms based on existing contextual morph annotations, e.g.
Made some LatinCy-based suggestions here—what else are people working with?
Not only can we start using LatinCy annotations to disambiguate words, we can also use the WW word formation logic to generate paradigms for spaCy tokens...
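As a rough illustration of paradigm generation and reinflection, the sketch below builds a first-declension noun paradigm from a stem and swaps one morphological feature to produce a new form. It does not reproduce the Whitaker's Words word formation logic or any LatinCy function: `ENDINGS_1ST_DECL`, `paradigm`, `parse_morph`, and `reinflect` are hypothetical names, the endings are written without macrons, and only one declension is covered. The morph string follows the UD `Feature=Value|Feature=Value` convention.

```python
from typing import Dict, Tuple

# First-declension endings keyed by (Case, Number), written without macrons.
ENDINGS_1ST_DECL: Dict[Tuple[str, str], str] = {
    ("Nom", "Sing"): "a",  ("Nom", "Plur"): "ae",
    ("Gen", "Sing"): "ae", ("Gen", "Plur"): "arum",
    ("Dat", "Sing"): "ae", ("Dat", "Plur"): "is",
    ("Acc", "Sing"): "am", ("Acc", "Plur"): "as",
    ("Abl", "Sing"): "a",  ("Abl", "Plur"): "is",
    ("Voc", "Sing"): "a",  ("Voc", "Plur"): "ae",
}


def paradigm(stem: str) -> Dict[Tuple[str, str], str]:
    """Generate every case/number form for a first-declension stem."""
    return {key: stem + ending for key, ending in ENDINGS_1ST_DECL.items()}


def parse_morph(morph: str) -> Dict[str, str]:
    """Parse a UD-style string such as 'Case=Acc|Number=Sing' into a dict."""
    return dict(pair.split("=", 1) for pair in morph.split("|") if pair)


def reinflect(stem: str, morph: str, **overrides: str) -> str:
    """Rebuild a form from its existing features, with some features replaced."""
    feats = parse_morph(morph)
    feats.update(overrides)
    return stem + ENDINGS_1ST_DECL[(feats["Case"], feats["Number"])]


if __name__ == "__main__":
    # 'puellam' is annotated accusative singular; ask for the plural instead.
    print(reinflect("puell", "Case=Acc|Gender=Fem|Number=Sing", Number="Plur"))
    print(paradigm("puell")[("Gen", "Plur")])
```

Keying the table on (Case, Number) keeps the contextual annotation as the starting point: reinflection changes one feature and leaves the rest as the tagger assigned them.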
From Whitaker's release notes... "Permission is hereby freely given for any and all use of program and data." Amazingly open license, allowing us to build cool stuff for Latin. And thanks also to Martin Keegan for hosting the maintenance of Words since 2015, cf. mk270.github.io/whitakers-wo...
Teaching a course this fall at @isawnyu.bsky.social on text analysis for historical languages—course description here: diyclassics.github.io/isaw-f2026-g.... If you are at NYU (or a consortium program) and are interested, let me know.
#TeachAncient #DigiClass
Graduate course at ISAW (Fall 2026) introducing computational methods for historical-language research: Python-based text analysis & NLP via word embeddings, transformer models, and large language mod...
Question for the #digitalclassics or #digital #neolatin crowd (and @diyclassics in particular, I guess): What would be the best semantic embedding models for (Neo-)Latin currently available (not word embeddings but embeddings of text chunks)?
Recommended […]
[Original post on openbiblio.social]