Inlay

We propose a protocol that combines evaluation datasets with persona dialogue prefixes to measure the effect of conversation length on model behavior. We then use it to measure the impact of length on: 🎭 Persona Fidelity ✅ Instruction Following 🔒 Safety