📢 My team at Meta (including Yaron Lipman and Ricky Chen) is hiring a postdoctoral researcher to help us build the next generation of flow, transport, and diffusion models! Please apply here and message me:
www.metacareers.com/jobs/1459691...
Looking for a principled evaluation method for ranking *general* agents or models, i.e. ones that are evaluated across a myriad of different tasks?
I’m delighted to tell you about our new paper, Soft Condorcet Optimization (SCO) for Ranking of General Agents, to be presented at AAMAS 2025! 🧵 1/N
We've built a simulated driving agent that we trained on 1.6 billion km of driving with no human data.
It is SOTA on every planning benchmark we tried.
In self-play, it goes 20 years between collisions.
Brandon is a wonderful research colleague, and I could not endorse working with him enough.
Brandon Amos
🤔 How to extract knowledge from LLMs to train better RL agents?
📚 Our new paper (w. Q. Zheng, @mikaelhenaff.bsky.social, A. Zhang, A. Grover) studies LLM-driven feedback for NetHack!
Paper: arxiv.org/abs/2410.23022
Code: github.com/facebookrese...
Highly recommended! I worked with Brandon, Yaron, and Ricky. Very smart, fun, and friendly crew thinking deeply about probabilistic machine learning.