Computational linguistics • Natural language processing • Formal linguistics • Machine translation | at Faculty of Mathematics and Physics, Charles University

Institute of Formal and Applied Linguistics

Announcing the First Call for Papers: Second Tokenization Workshop 🔡 📣 Non-archival submissions of two types: ▶️ Research papers (up to 9 pages) ▶️ Extended abstracts (up to 2 pages). Submission deadline: June 23, 2026 (AoE). Acceptance notification: July 24, 2026 (AoE). tokenization-workshop.github.io

Automatic Suggestions Help Extending Eventive Ontology: A Case Study on SynSemClass by @janastrakova.bsky.social, Eva Fučíková, Zdenka Urešová & @hajicjan.bsky.social lrec.elra.info/lrec2026-mai... Auto-suggestions improve semantic ontology annotation agreement in SynSemClass case study.

HotelCheckSpan: A Benchmark Dataset for LLM Faithfulness huggingface.co/datasets/pat... by @patuchen.bsky.social, @tuetschek.bsky.social and @saad.me.uk 🏨 Hotel summary faithfulness benchmark with span-level error labels. lrec.elra.info/lrec2026-mai...

DReUD: Discourse Relations in Universal Dependencies by Jiří Mírovský and Pavlína Synková lrec.elra.info/lrec2026-mai... UD-based shallow discourse relation annotation scheme + DReUD parser for Czech 🇨🇿 & English 🇬🇧

SEEM-CZ: Annotation and Classification of Epistemic Markers in Czech by Bára Štěpánková, Michal Novák, Tomáš Musil, Lucie Poláková lrec.elra.info/lrec2026-mai... Czech epistemic markers dataset (~4,000 uses) with annotations and XLM-RoBERTa classifiers for NLP.

CzechDocs: A Multiway Parallel Dataset of Formatted Documents for Minority Languages in Czechia 🇨🇿 by Pepa Jon and Ondřej Bojar 📝 lrec.elra.info/lrec2026-mai... 📂 github.com/cepin19/Czec...

#LREC2026 is over. It had a huge @ufal.mff.cuni.cz presence 👯🙆‍♂️🧑‍🤝‍🧑🙋🧓🧒 of 2️⃣6️⃣ people and 1️⃣4️⃣ main conference papers. If you missed our presentations, you can still check out our #NLProc and #CL papers 👇

Great discussions, inspiring talks, and lots of interest around the 🌲Prague Dependency Treebank🌲 at #LREC2026. Hope we helped give PDT some well-deserved visibility there! @ufal.mff.cuni.cz ➡️ lrec.elra.info/lrec2026-mai... ➡️ lrec.elra.info/lrec2026-mai... ➡️ lrec.elra.info/lrec2026-mai...

TokShop will be at #COLM2026! 🗓️ October 9th, 2026 📍 San Francisco, USA. More details and a call for papers coming soon.

Tokenization Workshop (TokShop) @COLM2026

Marie Mikulová
