This was a very fun project. In Behavior-Consistent Deep RL, we provide a method that aligns the behavior of independently trained policies. It turns out this works even in high-dimensional spaces. Here are 6 seeds of Humanoids (all with roughly the same return). Left: baseline. Right: ours.

16d

Marcel Hussing

🚨 New Preprint Alert: Behavior-Consistent Deep Reinforcement Learning 🚨 TL;DR: We introduce an approach that achieves behavioral similarity across independent algorithm executions in deep RL with continuous state-action spaces.

20d

Marcel Hussing