We hope this work provides a good introduction to the field.
Finding temporal structure is challenging, so we carefully laid out some of the most pressing questions in the field.
We also identified domains that are particularly promising, e.g. open-ended systems.
Martin Klissarov
Our team in London is hiring a research scientist! If you want to come work with a wonderful group of researchers on investigating the frontiers of autonomous open-ended agents that help humans be better at doing things we love, come have a look. Link in post below 👇
Our paper showing that LMs benefit from human-like abstractions for code synthesis was accepted to ICLR! 🇸🇬
We show that order matters in code generation: casting code synthesis as a sequential edit problem by preprocessing examples in SFT data improves LM test-time scaling laws.
We often get bogged down by differences in formalisms (goal-direction RL, options, feudal RL, skills …) -- we unite these core ideas through a single perspective.
We believe hierarchical RL is fundamentally about the algorithm through which we discover temporal structure.
We plan to keep improving this manuscript, so please share your feedback!
www.arxiv.org/abs/2506.14045
This work was done over the course of many friendly virtual calls with Akhil Bagaria and @ray-luo.bsky.social, and under the thoughtful guidance of researchers who have spent decades working on these problems, namely George Konidaris, Doina Precup and @marloscmachado.bsky.social