New paper in Nature. The more a government controls its domestic media, the more it dominates AI training data, the more pro-regime outputs we get from AI. By scraping the open web, LLMs are unwittingly laundering state-coordinated narratives into seemingly objective answers.
Eddie Yang
New in Nature: LLMs give "the party line" in the languages of authoritarian regimes. This happens when those regimes control the media, which feeds pretraining data. We show that more state control over the media means LLMs are less critical. 6 studies spanning 38 languages & 13 models. Details ↓
New paper out at AJPS: "The limits of AI for authoritarian control." The more repression there is, the less information exists in AI's training data, and the worse the AI performs.
Sol Messing
I wrote about my experience tinkering with AI for social science research: open.substack.com/pub/ey211/p/...
Eddie Yang
I spent two weekends figuring out how much I can integrate AI into my own research.
1/ New @Nature! We study how powerful institutions shape the information environment for LLMs. Commercial LLM training is opaque, so we trace a path from state-coordinated media -> training data -> model responses.