Inlay

We discuss the following intrinsic rewards. Intuitively, we encourage gain in prediction accuracy or reduction in entropy. For those which are Potential-based Reward Shaping, the optimality is not affected, and we hypothesize that they potentially accelerate the training. (4/9)