Open-source interpretability to seize the means of prediction. Postdoc @ Northeastern, @ndif-team.bsky.social w/ @davidbau.bsky.social.
gsarti.com
Gabriele Sarti
New blog: I am worried by NLP research culture
NLG and NLP are mostly much better in 2026 than when I got my PhD in 1990. Unfortunately research culture has gotten *worse* in this period, which really worries me as I retire.
ehudreiter.com/2026/06/08/n...
We are starting a new, nonprofit alignment organization, ⊢ Sequent Research, bringing together researchers previously on UK AISI’s Alignment Team, Timaeus, and elsewhere to research how to align superintelligence. We are hiring! 🧵
sequent.org/launch
The New England Mechanistic Interpretability (NEMI) workshop is coming to BU on Aug. 14!
Join us for talks, a panel, food, and plenty of opportunities to connect with the many great researchers in the area.
Register and help spread the word!
Check out our new hackathon for building the best AI lie detector on the market! A lot of interesting questions and great prizes for participants, apply early if you want in! :)
Check out our latest work led by @veraneplenbroek.bsky.social! Conversation topics seem to be much better explanations for LLM behavior than latent sociodemographic info about the user - but what does the choice of topic tell us about the speaker in the first place?
Despite the huge inflow of researchers, much of the work in interpretability remains anecdotal. Our new repro challenge at BlackboxNLP (co-located with EMNLP 2026) aims to attract work challenging common assumptions and showing failure/success cases of popular methods. Negative results welcome!
As computer scientists, we INVENTED the phrase "artificial intelligence." It was never exclusively yours.
In both fact and fiction, AI has always been awesome, and beautiful, and, yes, problematic — and it still is.
"You're right to call me on that!"
Can you catch an AI in the act of lying?
Register below to enter our AI lie-detection contest.
AI lies are a big problem. The frontier labs have all worked hard to fight AI deception. They all try to monitor their AIs for it.
With the large influx of submissions and a faster pace of research, reproducibility is more important than ever.
With this reproducibility challenge, we want to put the focus on best practices wrt. baselines🧱, ablations🌈, eval🔎 and generalizability🗺️ of interpretability!
In most ways NLG and NLP are much better in 2026 than when I got my PhD in 1990. Unfortunately research culture has gotten *worse* in this period, which really worries me as I retire. We have…
How do brains plan actions towards goals?
To get at this question we studied mice navigating complex mazes as goals changed on every trial 🧵
Work with @thomasakam.bsky.social @behrenstimb.bsky.social @kristorpjensen.bsky.social now on BioRxiv: www.biorxiv.org/content/10.6...