This work was devised at the ECIR collab-a-thon last year, and we hope to continue discussions at this year's collab-a-thon in Lucca! Read more here: arxiv.org/abs/2502.20937 #ECIR2025 #SIGIR2025

Mar 3, 2025

The fundamental property of Cranfield-style evaluations, that system rankings are stable even when assessors disagree on individual relevance decisions, was validated on traditional test collections. ...

Variations in Relevance Judgments and the Shelf Life of Test Collections