Sometimes this yields the right answer for the wrong reason (“Portuguese” from “Brazil”); other times, it produces confident errors (“Japanese” from “Honda”). 4/n
These examples show that answers, even to the same query, can shift under different irrelevant contexts. Can we predict these shifts? 2/n