💼 In business scenarios (selling defective products), models were either completely honest or completely deceptive. In public image scenarios (reputation management), behaviors were more ambiguous and complex. 4/
The multi-turn interactive setup is crucial: models often begin with equivocation but shift to falsification when pressed for clear answers. Stronger models like GPT-4o showed the greatest shift when prompted to deceive (an alarming 40% increase in falsification). 6/
Check out our paper to learn more about how LLMs navigate these ethical dilemmas: arxiv.org/abs/2409.09013. 7/
#AI #MachineLearning #AIEthics #LLMs #nlp #NLProc #NAACL2025
LLM agent simulations for policy: A field full of potential, yet clouded by myths and big questions.
We're opening a new venue to spark open discussion and drive this research forward. Join the conversation! 🧵
Obviously this is a pressing issue now: x.com/deedydas/sta...; x.com/DanHendrycks... Here, we put LLMs into a multi-turn dialogue environment that mimics the realistic setting in which users repeatedly try to seek information from LLMs. 2/
To be safely and successfully deployed, LLMs must simultaneously satisfy truthfulness and utility goals. Yet, often these two goals compete (e.g., an AI agent assisting a used car salesman selling a c...
🚨 New CHI 2026 Workshop 🚨
PoliSim@CHI 2026: LLM Agent Simulation for Policy
Yuxuan Li
When interacting with ChatGPT, have you wondered if it would ever "lie" to you? We found that under pressure, LLMs often choose deception. Our new #NAACL2025 paper, "AI-LIEDAR," reveals that models were truthful less than 50% of the time when faced with utility-truthfulness conflicts! 🤯 1/
Wonderful collaborations with Zhe Su, Anubha Kabra, Sanketh Rangreji, @jmendelsohn2.bsky.social, @faeze_brh, @maartensap.bsky.social