My favorite approach is based on internal RL. What the hell is that!? Our brain not only chooses (external) actions to maximize its return, it also optimizes its 'mental' processes (= internal actions): what to relay to working memory, what to replay, what to attend to, and what to perceive! See www.cell.com/neuron/fullt...