Thanks WiAIR (@wiair.bsky.social) for featuring my work on your YouTube channel. Watch the video to hear about our work on inference-time steering, and why these interventions on LLMs may not be as “precise” as they look.
arxiv.org/abs/2403.01015 comes to mind. Generally, Nihar's lab has a lot of amazing work in this space that should be relevant to your search
This call is still open. I am looking to recruit, as are many other faculty at Cornell. We review folders as they come in, and will send offers until all positions are filled.
Please share with your network 🙏
while steering methods effectively control target behavior, they substantially increase LLMs’ vulnerability to jailbreaks, revealing a failure of robust specificity. If you’re at EACL, stop by my poster at 9AM today to hear more.
Here's a link to the full paper: aclanthology.org/2026.eacl-lo...
In this work, we argue that evaluating efficacy alone isn’t enough. Steering has two sides — efficacy and specificity — yet current evaluations predominantly focus on the former. We introduce a three-part framework for specificity (general, control, robustness) and show that...
Woah, this is so cool! How was I not aware of this? I just set mine up to prepare for NeurIPS and I am loving it already... it made thousands of accepted papers so much more tractable to navigate
I have heard ~5 onsites per open position and ~10 phone interviews per onsite invite