It turns out that the eigenvalues of the pairwise similarity matrix are the key to measuring diversity. Both Determinantal Point Processes and the Vendi Score (arxiv.org/abs/2210.02410) formalize this intuition and allow differentiable optimization. It also correlates well with human judgement!
Diversity is an important criterion for many areas of machine learning (ML), including generative modeling and dataset curation. However, existing metrics for measuring diversity are often domain-spec...
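
To make the eigenvalue idea concrete, here is a minimal sketch of the Vendi Score as I understand it: the exponential of the Shannon entropy of the eigenvalues of K/n, where K is an n×n positive semidefinite similarity matrix with ones on the diagonal. The function names `vendi_score` and `cosine_similarity_matrix` and the cosine kernel are my own choices for illustration, not the reference implementation.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarities between the rows of X (illustrative kernel choice)."""
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T


def vendi_score(K, eps=1e-12):
    """exp(Shannon entropy) of the eigenvalues of K / n.

    Assumes K is symmetric positive semidefinite with K[i, i] == 1,
    so the eigenvalues of K / n are non-negative and sum to 1.
    """
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    eigvals = np.linalg.eigvalsh(K / n)
    # Drop (numerically) zero eigenvalues, using the convention 0 * log 0 = 0.
    eigvals = eigvals[eigvals > eps]
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


if __name__ == "__main__":
    # Four mutually orthogonal items behave like four distinct "species".
    print(vendi_score(np.eye(4)))
    # Four identical items collapse to a single effective item.
    print(vendi_score(np.ones((4, 4))))
    # Random embeddings land somewhere in between.
    rng = np.random.default_rng(0)
    print(vendi_score(cosine_similarity_matrix(rng.normal(size=(4, 8)))))
```

The score reads as an "effective number" of distinct items: it is n when all items are mutually dissimilar and 1 when they are all identical. Because it depends only on eigenvalues, the same computation can be written in an autodiff framework if you want gradients.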