🧵 New preprint! Towards Understanding Steering Strength w/ M. Taimeskhanov and D. Garreau
Activation steering is a popular way to control LLM behavior at inference. But how much should you steer? We provide the first theoretical analysis of the steering strength α.
arxiv.org/abs/2602.02712
1/7