Towards Parameter-Free Temporal Difference Learning
Yunxiang Li, Mark Schmidt, Reza Babanezhad, Sharan Vaswani
arxiv.org/abs/2603.02577
Temporal difference (TD) learning is a fundamental algorithm for estimating value functions in reinforcement learning. Recent finite-time analyses of TD with linear function approximation quantify its...