Do they actually induce? There's correlational evidence that they do: models that can state the rule tend to solve the task, and success climbs only once the past trajectories contain enough to pin the rule down. But the picture is uneven across rule types, and some wins look more like copying previously seen answers than genuine reasoning.
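
For a rough sense of how the "states the rule" versus "solves the task" link could be quantified, here is a minimal sketch using the phi coefficient over paired yes/no outcomes. The function name `phi_coefficient` and the toy lists are illustrative assumptions, not the measure or the data behind the summary above.

```python
from math import sqrt


def phi_coefficient(stated, solved):
    """Phi coefficient between two equal-length sequences of booleans.

    stated[i]: did the model state the rule correctly on episode i?
    solved[i]: did the model solve the task on episode i?
    (Both names are hypothetical; any paired binary outcomes work.)
    """
    if len(stated) != len(solved):
        raise ValueError("sequences must have the same length")

    # Fill the 2x2 contingency table.
    n11 = sum(1 for a, b in zip(stated, solved) if a and b)
    n10 = sum(1 for a, b in zip(stated, solved) if a and not b)
    n01 = sum(1 for a, b in zip(stated, solved) if not a and b)
    n00 = sum(1 for a, b in zip(stated, solved) if not a and not b)

    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # One variable is constant, so no association can be measured.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Toy values made up for illustration only, not experimental results.
toy_stated = [True, True, True, False, False, False]
toy_solved = [True, True, False, False, False, True]
print(phi_coefficient(toy_stated, toy_solved))
```

A value near 1 would mean stating the rule and solving the task almost always go together. A high value alone would still not rule out answer copying, which is why the caveat above matters.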