We discuss the following intrinsic rewards. Intuitively, they reward gains in prediction accuracy or reductions in entropy. Those that take the form of potential-based reward shaping leave the optimal policy unchanged, and we hypothesize that they may accelerate training. (4/9)
11mo
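
To make the potential-based claim in the post above concrete, here is a minimal sketch. It assumes a shaping term of the standard form F(s, s') = gamma * Phi(s') - Phi(s), with a hypothetical potential Phi equal to the negative entropy of a belief distribution. The belief trajectory, the choice of potential, and all function names are illustrative assumptions, not the thread authors' actual rewards. The point is that the discounted shaping terms telescope, so their sum depends only on the first and last states, which is why this kind of shaping does not change which policy is optimal.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def potential(belief):
    # Hypothetical potential (an assumption for this sketch):
    # negative entropy, so less uncertainty means higher potential.
    return -entropy(belief)


def shaping_term(belief, next_belief, gamma):
    # Potential-based form: F(s, s') = gamma * Phi(s') - Phi(s)
    return gamma * potential(next_belief) - potential(belief)


def discounted_shaping_sum(beliefs, gamma):
    # Sum of gamma^t * F(s_t, s_{t+1}) along the trajectory.
    return sum(
        gamma**t * shaping_term(beliefs[t], beliefs[t + 1], gamma)
        for t in range(len(beliefs) - 1)
    )


# Toy trajectory of beliefs that become progressively more certain.
beliefs = [
    [0.25, 0.25, 0.25, 0.25],
    [0.5, 0.5, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
gamma = 0.9

total = discounted_shaping_sum(beliefs, gamma)

# Telescoped value: gamma^T * Phi(s_T) - Phi(s_0), independent of the middle states.
steps = len(beliefs) - 1
telescoped = gamma**steps * potential(beliefs[-1]) - potential(beliefs[0])

print(total, telescoped)
```

The two printed values agree up to floating-point error. Changing the intermediate beliefs changes the per-step bonuses but not the total, so the shaping can only affect how quickly a learner finds good behavior, not which behavior is optimal.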