An LLM's alignment with demographic subgroups can look good on individual survey questions yet still miss the correlation structure of cultural values.
Tristan Williams, Sebastian Padó, Alan Akbik and I propose a two-level evaluation framework and apply it to demographically aligned LLMs.
arxiv.org/abs/2601.15755