One of my favorite equations, under our assumptions, details how the system dynamics (A) and control cost (Q) interact to set the closed-loop dynamics (F).
This reveals a continuum of environment-objective pairs consistent with behavior. Inverse RL / IOC typically lies at one end of this continuum.
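To make the continuum concrete, here is a minimal sketch in a textbook setting: discrete-time, fully observed LQR with B = I and R = I. These simplifications, the function names, and the parameterization by a value matrix P = p·I are illustrative assumptions, not the paper's setup, and the paper's equation may take a different form. Under these assumptions the Riccati equation gives A = (I + P) F and Q = P − Fᵀ(P + P²)F, so each admissible p yields a different (A, Q) pair with the same closed-loop F.

```python
import numpy as np
from scipy.linalg import solve_discrete_are


def closed_loop(A, Q):
    """Closed-loop matrix F = A - B K for LQR, assuming B = I and R = I."""
    n = A.shape[0]
    B, R = np.eye(n), np.eye(n)
    P = solve_discrete_are(A, B, Q, R)                 # stabilizing Riccati solution
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # optimal feedback gain
    return A - B @ K


def pair_from_value(F, p):
    """Build an (A, Q) pair whose optimal closed loop is F, using P = p * I."""
    n = F.shape[0]
    A = (1.0 + p) * F                                  # A = (I + P) F
    Q = p * np.eye(n) - (p + p**2) * (F.T @ F)         # Q = P - F^T (P + P^2) F
    return A, 0.5 * (Q + Q.T)                          # symmetrize against round-off


# Hypothetical stable closed-loop dynamics, chosen only for illustration.
F = np.array([[0.5, 0.1],
              [0.0, 0.4]])

for p in (0.5, 1.0, 2.0):
    A, Q = pair_from_value(F, p)
    assert np.all(np.linalg.eigvalsh(Q) > 0)           # Q is a valid (positive definite) cost
    print(p, np.allclose(closed_loop(A, Q), F))        # same behavior, different (A, Q)
```

In this sketch, p → 0 recovers A = F with a vanishing state cost (no control at all), while larger p attributes more of the observed behavior to control acting on less stable open-loop dynamics.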
Inferring both the system dynamics *and* the control objective from partial observations is inherently ill-posed. Characterizing the exact (non-)identifiability and working out how to perform inference were the challenge!
Assuming known environments or costs is reasonable in engineered systems, but maybe less so for intelligent agents in complex worlds.
A year later, I see this as clarifying how unobserved objectives and dynamics interact to produce a continuum of explanations, and which perturbations are needed to tell them apart.
This was a theory project spearheading a longer program on neural substrates of cognitive control, with the amazing Juncal Arbelaiz and Harrison Ritz (@hritz.bsky.social), and with great guidance from Nathaniel Daw (@nathanieldaw.bsky.social), Jon Cohen, and Jonathan Pillow (@jpillowtime.bsky.social).
We show that the joint problem boils down to two steps:
1. Infer closed-loop parameters (which can be done efficiently with SSM methods ✅)
2. Derive equations relating the parameters of interest (system dynamics and control cost) to the closed-loop dynamics they set.
See our paper (also on arXiv, link above) for details!
In a system subject to unobserved control, can you infer both the underlying dynamics and the control objective? 🤔
A year ago, I was presenting our work at IEEE CDC on solving this problem for stochastic LQR.
arxiv.org/abs/2502.15014
Short 🧵 on the results, and how I think about them a year later.
Congrats to PNI + affiliated trainees named 2025 Honorific Fellows by the @princeton.edu Graduate School!
👏 Victor Geadah
👏 Isaac Christian
👏 @danmirea.bsky.social
gradschool.princeton.edu/news/2025/ho...