PhDing at LTI, CMU
Prev: Ai2, Google Research, MSR
Evaluating language technologies, regularly ranting, and probably procrastinating.
https://sites.google.com/view/shailybhatt/
Shaily
Loved this wonderful essay, which talks about discernment of LLM use but also how we are doing too much.
I just realized that the initials of one of my fav musicians and the acronym of a conference deadline are the same. I do not know how I never saw this, despite listening to the same music on loop in the days before the deadline all my (research) life. It's blowing my mind, so I'm shouting it into this void 💀
It finally happened: someone told me that a direction I suggested made sense because "gemini says its novel and no one is focusing on it".
This is a real banger of a paper. The example of a model being weirdly focused on jasmine (lol) makes me increasingly think that single-point-of-access models don't really consider who their audience is. Jasmine is a super legible cultural marker for people outside, but is so, _so_ generic.
Models are now expert math solvers, and so AI for math education is receiving increasing attention.
Our new preprint evaluates 11 VLMs on our QA benchmark, DrawEduMath. We highlight a startling gap: models perform less well on inputs from K-12 students who need more help. 🧵
The Workshop on Developing Standards and Documentation For LLM Use in HCI Human Subjects Research aims to bring the HCI community together to develop standards, guidance, and documentation for the use of large language models (LLMs) as simulated research authors. 1/2
Sireesh Gururaja
Deadline for submission is in just under 10 days! Reach out if you have any questions.
Willie Agnew
Lucy Li
Tal August
What is the future of reading? 📗
Announcing the 1st Science & Technology of Augmented Reading (STAR) workshop at #CHI2026!
We want your takes on: 🤖 AI & Agents for reading 👁️ Visual Interactions 🗺️ Domains (Code, Law, Ed, etc.)
👇 Submit a 2-4 pg paper: chi-star-workshop.github.io