HotelCheckSpan: A Benchmark Dataset for LLM Faithfulness
huggingface.co/datasets/pat... by @patuchen.bsky.social, @tuetschek.bsky.social and @saad.me.uk
🏨 A benchmark for the faithfulness of hotel summaries, with span-level error labels. lrec.elra.info/lrec2026-mai...