[7/8] Beyond Grid World, ESWM scales to the more complex MiniGrid (high-dimensional observations) and to 3D indoor scenes in ProcThor (realistic pixel observations).
[3/8] ESWM is designed to operate on sets of temporally independent transitions. Given such a set, it infers unseen transitions. The model is meta-trained across environments to support generalization. We validate ESWM in three settings.
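To make the set-based interface in [3/8] concrete, here is a minimal Python sketch. It is not the ESWM model: the `Transition` type, the `predict_next_state` function, and the toy grid with reversible moves are all assumptions made for illustration, and a hand-written lookup rule stands in for the learned, meta-trained network. It only shows the input/output shape: an unordered memory of transitions goes in, and a query about a transition that was never stored comes out.

```python
from typing import FrozenSet, NamedTuple, Optional, Tuple

State = Tuple[int, int]  # (row, col) in a hypothetical toy grid


class Transition(NamedTuple):
    state: State
    action: str  # one of "N", "S", "E", "W"
    next_state: State


# Assumption for this toy only: every successful move can be undone.
INVERSE = {"N": "S", "S": "N", "E": "W", "W": "E"}


def predict_next_state(
    memory: FrozenSet[Transition], state: State, action: str
) -> Optional[State]:
    """Hand-written stand-in for a learned model.

    Answers a (state, action) query from an unordered memory of
    transitions, or returns None when the memory gives no evidence.
    """
    # 1) The exact transition is stored in memory.
    for t in memory:
        if t.state == state and t.action == action:
            return t.next_state
    # 2) An unseen transition inferred by reversing a stored move
    #    (moves that bumped into a wall are skipped).
    for t in memory:
        moved = t.state != t.next_state
        if moved and t.next_state == state and INVERSE[t.action] == action:
            return t.state
    # 3) Nothing in memory supports an answer.
    return None


# The memory is a set: the order of collection carries no information.
memory = frozenset(
    [
        Transition((0, 0), "E", (0, 1)),
        Transition((0, 1), "S", (1, 1)),
        Transition((0, 0), "N", (0, 0)),  # bumped into a wall
    ]
)

seen = predict_next_state(memory, (0, 0), "E")     # stored directly
unseen = predict_next_state(memory, (1, 1), "N")   # never stored, inferred
unknown = predict_next_state(memory, (1, 1), "E")  # no evidence
```

Because the memory is a `frozenset`, shuffling the transitions gives the same memory and therefore the same answers, which is the property the post refers to as temporal independence.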
New paper 🚨 #ICLR26
Most world models predict the future from a past trajectory. But neuroscience suggests that such inference can instead be made from temporally independent experiences.
We built the Episodic Spatial World Model (ESWM), a model that does exactly this:
Video abstract [1/2]
[6/8] When environments change (e.g., new obstacles), ESWM adapts by updating its temporally and spatially independent memories. No retraining is needed.
[5/8] ESWM also supports efficient exploration by acting on uncertainty to collect experiences and navigate between states.
[1/8] Existing world models rely on a sequence of observations to predict future states. This leads to: 1) redundancy from temporal overlap (contexts grow in large environments), 2) limited adaptability when environments change, because of temporal dependency.