Post
PQN, a recently introduced value-based method (bsky.app/profile/matt...), collects data in a similar way to PPO. We see a similar trend as with PPO, but it is much less pronounced. It is possible that our findings are more correlated with policy-based methods. 9/
Jun 5, 2025
Pablo Samuel Castro
Super excited to share that our paper, Simplifying Deep Temporal Difference Learning, has been accepted as a spotlight at ICLR! My fab collaborator Matteo Gallici and I have written a three-part blog on the work, so stay tuned for that! :) @flair-ox.bsky.social arxiv.org/pdf/2407.04811
Mar 18, 2025
Mattie Fellows