Bradley-Terry Rankings for Recommender Systems Across Dataset Taxonomies

2026-06-05 • Information Retrieval

Information Retrieval, Machine Learning

AI summary

The authors address the problem of fairly ranking recommendation algorithms, which is difficult because algorithm performance varies with dataset features such as sparsity and size. They propose a new way to rank these algorithms using the Bradley-Terry model and show that the resulting ranking depends on key dataset statistics. They also introduce a metric for checking how consistent these rankings are and show that their approach remains robust with incomplete data. Lastly, they develop a way to predict algorithm rankings on new datasets without running the algorithms, using extensions of the Bradley-Terry model (BT trees and BT models with covariates).

recommendation algorithms, ranking methodology, Bradley-Terry model, dataset characteristics, NDCG, performance metrics, ranking consistency, covariates, algorithm evaluation, incomplete data

Authors

Ekaterina Grishina, Stepan Kuznetsov, Askar Tsyganov, Ilya Ivanov, Daria Korovaitceva, Margarita Rusanova, Uliana Parkina, Alexander Derevyagin, Evgeny Frolov, Sergey Samsonov, Anton Lysenko

Abstract

The ranking of recommendation algorithms is a challenging problem since model performance is sensitive to dataset characteristics such as sparsity, sequential structure, and scale. This drives a demand for a proper methodology for fair comparison between algorithms. Naive aggregation of performance metrics (e.g., averaging NDCG over benchmarks) can yield misleading rankings, undermining practical selection. To address this problem, we introduce a novel, data-driven ranking methodology based on the Bradley-Terry (BT) model. We demonstrate that the obtained ranking depends on key dataset statistics. Additionally, we propose a novel metric for evaluating ranking consistency and demonstrate the robustness of our ranking to incomplete data. Finally, we introduce a dataset-specific methodology for ranking algorithms on unseen datasets without running the models, relying on extensions of the Bradley-Terry framework, including BT trees and BT models with covariates.
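
As a rough illustration of the basic idea, the following minimal sketch fits a plain Bradley-Terry model to pairwise "wins" between algorithms, where algorithm i beats algorithm j on a dataset if it has the higher NDCG there. It is not the authors' implementation: the NDCG values are invented for illustration, the helper names pairwise_wins and fit_bradley_terry are hypothetical, and the fit uses the standard minorization-maximization (MM) update for BT strengths. It assumes every algorithm wins at least one comparison and covers none of the extensions mentioned above (BT trees, covariates).

```python
import numpy as np

def pairwise_wins(scores):
    """wins[i, j] = number of datasets where algorithm i scores strictly above j.

    scores has shape (n_datasets, n_algorithms).
    """
    scores = np.asarray(scores, dtype=float)
    return (scores[:, :, None] > scores[:, None, :]).sum(axis=0)

def fit_bradley_terry(wins, n_iter=200, tol=1e-10):
    """Estimate BT strengths p, with P(i beats j) = p[i] / (p[i] + p[j]).

    Uses the classic MM update; assumes each algorithm has at least one win.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    games = wins + wins.T              # decided comparisons per pair
    total_wins = wins.sum(axis=1)
    p = np.ones(n) / n                 # uniform starting point
    for _ in range(n_iter):
        denom = games / (p[:, None] + p[None, :])
        np.fill_diagonal(denom, 0.0)   # an algorithm is never compared to itself
        new_p = total_wins / denom.sum(axis=1)
        new_p /= new_p.sum()           # fix the scale: strengths sum to 1
        converged = np.max(np.abs(new_p - p)) < tol
        p = new_p
        if converged:
            break
    return p

# Hypothetical NDCG table: rows are datasets, columns are algorithms A, B, C.
scores = np.array([
    [0.30, 0.25, 0.20],
    [0.28, 0.31, 0.22],
    [0.15, 0.12, 0.18],
    [0.40, 0.35, 0.30],
])

wins = pairwise_wins(scores)
strengths = fit_bradley_terry(wins)
ranking = np.argsort(-strengths)       # algorithm indices, strongest first
print(ranking, strengths)
```

Because the model only uses the outcome of each comparison, a dataset with uniformly high NDCG values does not dominate the result the way it can when raw metrics are averaged.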
