Grima et al. introduce a novel foraging paradigm to understand how mice rapidly learn to exploit multi-option environments with varied reward statistics. Decision-making is explained by a model that integrates optimal foraging theory, reinforcement learning, and machine learning. Mesolimbic dopamine best reflects the model's dynamic global learning rate.