🛍️Major AI companies are increasingly embedding sponsored content into chatbot conversations. Across two preregistered experiments (N=2,012), we test how effectively AI can steer consumers toward sponsored products in a realistic shopping scenario. 📝https://arxiv.org/abs/2604.04263

Good work from @hayoungjung.bsky.social and @manoelhortaribeiro.bsky.social! Scientific AI agents are actively being deployed to synthesize clinical conclusions, but their factual accuracy remains remarkably low. #MedSky 🔗 Direct link: arxiv.org/pdf/2606.11337

📝Excited to share our new preprint, “AI Assistance for Discretionary Work: Increasing Feedback Provision in Higher Education”: arxiv.org/abs/2606.03095 A thread 🧵 1/8

Deepfake pornography isn’t going away just because we are passing laws and taking down a couple of big websites. Our new preprint, led by @aedcv.bsky.social, suggests that the sharing of this material continued to prosper even after platform and policy shocks. arxiv.org/abs/2602.02754

Francesco Salvi

Scott McGrath

First paper of my PhD with my amazing advisors! There’s been a ton of hype and media coverage on OpenEvidence as an “AI co-pilot for clinicians”… and our long-horizon benchmark puts them to the test!! Our results suggest they are far from reliable for downstream use.

Romina Mahinpei

One thing we also didn’t expect while building this benchmark: AI agents kept “cheating.” Even when told not to, they searched the web for ground-truth answers. So we built a clean-room harness to filter answer-leaking results. We’re now exploring this more deeply in follow-up work 👀

Broadly interested in computational social science, AI safety & evaluation, NLP for social good & applications (in public health, science...)! Happy to chat or grab coffee at the conference! Feel free to DM me :)

I am at #EMNLP2025🇨🇳 to present our main paper *MythTriage: Scalable Detection of Opioid Use Disorder Myths on a Video-Sharing Platform*! Come by to discuss details! 🏦 Location: Hall C ⏲️Time: 11AM-12:30PM 🔗 Paper: aclanthology.org/2025.emnlp-m... 📁 Repo: github.com/hayoungjungg...

New preprint! We introduce a new benchmark, SciConBench, with 9.11k scientific questions derived from Cochrane Systematic Reviews. We find evidence that frontier AI agents **cannot** synthesize scientific conclusions well. A thread 🧵 w/ @hayoungjung.bsky.social & others!

Whoa, excellent study just dropped in Science! "Reranking partisan animosity in algorithmic social media feeds alters affective polarization" www.science.org/doi/10.1126/... Led by @tiziano.bsky.social and @msaveski.bsky.social

Manoel Horta Ribeiro
