Associate Prof at American University SPA (formerly UW-Madison), NBER & IZA studying environmental, health & ed policy. PhD from Northwestern U. Thoughts are my own.

Claudia Persico

Two things are going wrong. First: each AI has a different calibration. Second: a feedback loop. Tasks where AI is advancing fastest generate the most training data, so newer models rate those tasks as more exposed. Paper (free): nber.org/papers/w35110

1mo

Welcome our next seminar speaker, Claudia Persico (@claudiapersico.bsky.social). She'll be presenting "The Effects of Daily Air Pollution on Students, Teachers and School Violence."

1mo

Miami University Economics

Is anyone looking for a last-minute extra paper on pollution, education, or health (or crime) for a panel at #ASSA2027?

My new paper on Air Quality and Suicide is online at the Journal of Health Economics! www.sciencedirect.com/science/arti...

Then we asked: does this matter for the conclusions economists are actually drawing? We plugged each model’s scores into a standard labor economics analysis. With one model’s scores: significant job losses. With another’s: no detectable effect. The entire finding flipped. 3/

2mo

These scores are not academic exercises. The ILO uses them. The IMF uses them. The BLS uses them. Acemoglu (2025), Brynjolfsson et al. (2025), and Eisfeldt et al. (2023) are built on them. Nobody was checking whether a different model would give a different answer. 4/

The Wall Street Journal covered my new paper on how we estimate the effects of AI (incorrectly) with @michelleyin.bsky.social and @hoavu.bsky.social! www.wsj.com/tech/ai/ai-m...

1mo

We asked four LLMs how exposed your job is to AI. They could NOT agree. Management: 15% vs 90%. Legal: 10% vs 75%. Healthcare: 5% vs 60%. Same rubric. Same jobs. Same data. Different AI, completely different answer. New @nber.org working paper with @michelle-yin.bsky.social & @hoavu.bsky.social! 1/

We replicated the most widely used AI exposure rubric (Eloundou et al. 2024) with four frontier models: GPT-4, ChatGPT-5, Gemini 2.5, and Claude 4.5. Same instructions. Same O*NET task data. Same pipeline. Mean exposure ranged from 14% to 51%. A 3.6x gap on identical jobs. 2/
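As a quick check on the arithmetic in the post above, here is a minimal Python sketch. It uses only the two figures quoted there (14% and 51%); the variable names are illustrative and do not come from the paper.

```python
# Lowest and highest mean exposure across the four models,
# as quoted in the post (in percent).
lowest_mean = 14
highest_mean = 51

# Relative gap: how many times larger the highest mean is than the lowest.
ratio = highest_mean / lowest_mean

# Absolute gap in percentage points.
gap_pp = highest_mean - lowest_mean

print(f"ratio: {ratio:.1f}x, gap: {gap_pp} percentage points")
```

The ratio is the relative gap the post reports; the percentage-point difference is the same disagreement in absolute terms.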

1mo

Here is every occupation, all 95, sorted by how much the four models disagree. Top of the chart: 87 percentage points of disagreement for a single occupation. One AI sees the job as almost fully exposed. Another sees it as barely exposed. Find your job. 5/
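One plausible way to build a ranking like this is to take, for each occupation, the gap between the highest and lowest score any model assigned, then sort by that gap. The sketch below assumes that definition of "disagreement" (the post does not spell it out). The function name is hypothetical, and the input holds only the low and high figures quoted for three groups in post 1/, not the paper's per-model scores for all 95 occupations.

```python
def rank_by_disagreement(scores):
    """Return (name, spread) pairs sorted by spread, largest first.

    `scores` maps a name to the exposure scores (in percent) that the
    different models assigned. Spread is max minus min, in percentage
    points. This is an assumed definition of disagreement.
    """
    spreads = [(name, max(vals) - min(vals)) for name, vals in scores.items()]
    return sorted(spreads, key=lambda pair: pair[1], reverse=True)


# Illustrative input: only the two extremes quoted in post 1/ per group.
headline_scores = {
    "Healthcare": [5, 60],
    "Management": [15, 90],
    "Legal": [10, 75],
}

ranked = rank_by_disagreement(headline_scores)
for name, spread in ranked:
    print(f"{name}: {spread} percentage points")
```

With four scores per occupation the same function applies unchanged, since it only looks at the minimum and maximum of each list.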

Founded in 1920, the NBER is a private, non-profit, non-partisan organization dedicated to conducting economic research and to disseminating research findings among academics, public policy makers, an...

nber.org

How (un)Stable Are LLM Occupational Exposure Scores? Evidence from Multi-Model Replication
