Check out our new paper, which investigates three phenomena (hallucination, refusal, and sycophancy) both externally and internally, and shows a high correlation between the two!
Adi Simhi
How does an LLM’s past influence its future?🤔
In new work led by @adisimhi.bsky.social, together with @fbarez.bsky.social, @boknilev.bsky.social, and Shay Cohen, we find that conversational history creates a latent "geometric trap" that makes old habits (e.g., hallucinations) hard to break!