//
sign in
Post
by @danabra.mov
PostEmbed
by @danabra.mov
Record
by @jimpick.com
Record
by @atsui.org
+ new component
Post
PQN Blog 1/3: TD methods are the bread and butter of RL, yet can have convergence issues when used in practice. This has always annoyed me. Find out below why TD is so unstable and how can we understand this instability better using the TD Jacobian. @flair-ox.bsky.social @jfoerst.bsky.social
Mar 19, 2025
blog.foersterlab.com
Fixing TD Pt I: Why is Temporal Difference Learning so Unstable?
Mattie Fellows