🚨Our paper is out in PNAS: we found classic human persuasion techniques worked on AIs in a "parahuman" way, making them agree to objectionable requests (increasing compliance from 35% to 51%)
It worked on a range of major recent LLMs, though newer models do resist more. www.pnas.org/doi/10.1073/...