Nature - Government-controlled media influences the output of large language models via their training data, and models queried in the languages of countries with lower media freedom show a...
1/ New @Nature! We study how powerful institutions shape the information environment for LLMs. Commercial LLM training is opaque, so we trace a path from state-coordinated media -> training data -> model responses.
Thank you @meharpist.bsky.social. It was a pleasure working with you on this paper.
New in Nature: LLMs give "the party line" in the languages of authoritarian regimes. This works when they control the media, which feeds pretraining data. We show more state control over the media means less critical LLMs. 6 studies spanning 38 languages & 13 models. Details ↓
Hannah Waight
Brandon Stewart
I’m excited to share a new paper in Nature that shows how large language models launder the strategic rhetoric of authoritarian states. Paper here: www.nature.com/articles/s41.... A thread.
Thank you so much @jenjennings.bsky.social !
Wonderful to work on this with you @jatucker.bsky.social !
Sol Messing
Government-controlled media influences the output of large language models via their training data, and models queried in the languages of countries with lower media freedom show a stronger ...
Millions of people (including me) turn to LLMs for information and advice. But who shapes their output? A @nature.com paper shows the influence of state-controlled media on LLM output. Read all the details in the article here: www.nature.com/articles/s41...
“Here we show through six studies that government control of the media across the world already influences the output of LLMs via their training data.”
This is, hands down, the best paper I’ve read about AI. Kudos to this rockstar team!
1/ Excited to report we have a new paper out in @nature.com today! The bottom line: training data for LLMs does not just fall from the sky - it is created in the context of existing social and political institutions - and that has consequences for LLM output.
nature.com/articles/s41...