Our approach: we take a meta-RL algorithm recently proposed as a model of PFC, and train it to select not just physical actions, but also mental actions—internal computations that leave the environment unchanged but generate decision-relevant information.