AI systems increasingly generate drafts, suggestions, and templates that people can adopt, revise, or ignore. Prior work shows AI can improve required tasks, but can it help people do beneficial work they often intend to do but frequently skip?
2/8
Joint work with the amazing Victoria Dean, Ruth Fong, Lydia Liu, and Manoel Horta Ribeiro!
cc: @manoelhortaribeiro.bsky.social, @vdean.bsky.social
8/8
In many domains, the challenge is not only how well work is done, but whether it gets done at all. Our findings suggest AI can help people initiate discretionary but socially beneficial work that might otherwise go undone.
7/8
We studied this question in a semester-long randomized field experiment on personalized feedback provision in an undergraduate machine learning course, a pedagogically valuable but often optional task.
3/8
Excited to share our new preprint, “AI Assistance for Discretionary Work: Increasing Feedback Provision in Higher Education”: arxiv.org/abs/2606.03095
A thread 🧵 1/8
Interestingly, AI did not significantly reduce the time spent per unit of feedback. Interviews revealed why: TAs rarely treated drafts as final outputs. They used them as editable scaffolds that helped them articulate feedback and get started on feedback they might otherwise not have written.
6/8
Romina Mahinpei
We built an AI-assisted feedback system that generated personalized feedback drafts for TAs after grading. TAs remained fully in control and could use, edit, or ignore the drafts. We evaluated the system across 2,828 graded question submissions.
4/8
AI assistance increased feedback provision by 10.8 percentage points and increased feedback length by 39.8 characters. Students received substantially more personalized feedback, without lower ratings of usefulness.
5/8
🚨New preprint!🚨
arxiv.org/abs/2606.03095
In a randomized field experiment in a 300-level machine learning course, teaching assistants received AI-assisted feedback drafts after grading student submissions.
Did AI help them provide more feedback when feedback was optional?
A thread 🧵
New preprint!
We introduce a new benchmark, SciConBench, with 9.11k scientific questions derived from Cochrane Systematic Reviews.
We find evidence that frontier AI agents **cannot** synthesize scientific conclusions well.
A thread 🧵
w/ @hayoungjung.bsky.social & others!