Extremely good and important work. Thank you @bttyeo.bsky.social
Felipe De Brigard
Cross-validation is widely used to compare predictive performance — not just to benchmark algorithms, but for scientific applications like ranking biomarkers.
But performance across folds isn't independent, so standard tests (e.g. paired t-test) inflate false positives. 2/N