Inlay

PhDing @BrownCS | Algorithm auditing & accountability | Understanding Algorithmic Systems | victorojewale.github.io/ | https://victorojewale.substack.com/

Technologies like synthetic data, evaluations, and red-teaming are often framed as enhancing AI privacy and safety. But what if their effects lie elsewhere? In a new paper with @realbrianjudge.bsky.social at #EAAMO25, we pull back the curtain on AI safety's toolkit. (1/n) arxiv.org/pdf/2509.22872

How do we stop playing whack-a-mole when it comes to deepfake abuse? 🧵⚠️

Agents prioritize task completion over the question of whether they should act at all. This is a consequence of how they are trained. My student @victorojewale.bsky.social has been investigating this and just wrote a (prize-winning) paper arguing why (and how) we need a notion of "informed abstention". Link below.