Even strong embedding models over-index on surface features: for every model tested, similarity scores reflect author or fandom more than semantic aspects such as theme or characterization. This holds even when the models are explicitly instructed to focus on those aspects!
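
One way to make a claim like this concrete is to compare how much more similar same-label pairs are than different-label pairs, once for a surface label (author) and once for a semantic label (theme). The sketch below is illustrative only and is not necessarily the measurement used here: the function name `label_similarity_gap`, the toy vectors, and the labels are all hypothetical, and real embeddings would come from whichever model is being evaluated.

```python
import numpy as np

def label_similarity_gap(embeddings, labels):
    """Mean cosine similarity of same-label pairs minus that of different-label pairs.

    Hypothetical helper: a larger gap means the labeling explains more of
    the similarity structure in the embedding space.
    """
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
    sims = X @ X.T                                    # pairwise cosine similarity
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]         # True where labels match
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)  # each pair once, no diagonal
    return sims[same & upper].mean() - sims[~same & upper].mean()

# Toy data (made up): the first coordinate tracks author, the second tracks theme.
toy_embeddings = [
    [1.0, 0.1],    # author A, theme x
    [1.0, -0.1],   # author A, theme y
    [-1.0, 0.1],   # author B, theme x
    [-1.0, -0.1],  # author B, theme y
]
authors = ["A", "A", "B", "B"]
themes = ["x", "y", "x", "y"]

author_gap = label_similarity_gap(toy_embeddings, authors)
theme_gap = label_similarity_gap(toy_embeddings, themes)
print(f"author gap: {author_gap:.3f}, theme gap: {theme_gap:.3f}")
```

In this toy setup the author gap exceeds the theme gap because the author coordinate dominates each vector, which mirrors the pattern described above: similarity tracking a surface attribute more closely than a semantic one.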