a natural language processor and “sensible linguist”. PhD-ing LTI@CMU, previously BS-ing Ling+ECE@UTAustin
🤠🤖📖 she/her
lindiatjuatja.github.io
Lindia Tjuatja
Really excited about this work w/ my long-time collaborators at Boulder!
We address limitations in existing morphosyntactic annotation systems for digitally under-resourced languages and show how *jointly* predicting morphological segmentation helps with glossing performance
We have to talk about rigor in AI work and what it should entail. The reality is that impoverished notions of rigor not only lead to some one-off undesirable outcomes but can also have a deeply formative impact on the scientific integrity and quality of both AI research and practice 1/
I've found it kind of a pain to work with resources like VerbNet, FrameNet, PropBank (frame files), and WordNet using existing tools. Maybe you have too. Here's a little package that handles data management, loading, and cross-referencing via either a CLI or a Python API.
Lindia Tjuatja
Good news (for me!) my gender bias paper from 2023 still replicates with GPT-5.
Bad news (for everyone!) my gender bias paper from 2023 still replicates with GPT-5.
arxiv.org/pdf/2308.14921
hkotek.com/blog/gender-...
🇦🇹I'll be at #ACL2025! Recently I've been thinking about:
✨linguistically + cognitively-motivated evals (as always!)
✨understanding multilingualism + representation learning (new!)
I'll also be presenting a poster for BehaviorBox on Wed @ Poster Session 4 (Hall 4/5, 10-11:30)!
Whoops my b, 11-12:30
Where does one language model outperform the other?
We examine this from first principles, performing unsupervised discovery of "abilities" that one model has and the other does not.
Results show interesting differences between model classes, sizes and pre-/post-training.
Alexandra Olteanu
I did an interview w/ Pittsburgh's NPR station to share some of my views on the topic of the McCormick/Trump AI & Energy summit at CMU tomorrow. Despite being hosted at the university, there will not be opportunities for our university experts to contribute viewpoints at the event.
The viral "Definition of AGI" paper tells you to read fake references which do not exist!
Proof: different articles present at the specified journal/volume/page number, and their titles exist nowhere on any searchable repository.
Take this as a warning to not use LMs to generate your references!