Bradley-Terry Rankings for Recommender Systems Across Dataset Taxonomies

2026-06-05 · Information Retrieval

Information Retrieval, Machine Learning
AI summary

The authors address the problem of fairly ranking recommendation algorithms, which is difficult because algorithm performance changes with dataset features such as sparsity and size. They propose ranking these algorithms with the Bradley-Terry model and show that the resulting ranking depends on dataset statistics. They also introduce a metric for checking how consistent these rankings are and show that their approach remains robust with incomplete data. Lastly, they develop a way to predict algorithm rankings on new datasets without running the algorithms, using extensions of the Bradley-Terry model (BT trees and BT models with covariates).

recommendation algorithms, ranking methodology, Bradley-Terry model, dataset characteristics, NDCG, performance metrics, ranking consistency, covariates, algorithm evaluation, incomplete data
Authors
Ekaterina Grishina, Stepan Kuznetsov, Askar Tsyganov, Ilya Ivanov, Daria Korovaitceva, Margarita Rusanova, Uliana Parkina, Alexander Derevyagin, Evgeny Frolov, Sergey Samsonov, Anton Lysenko
Abstract
Ranking recommendation algorithms is a challenging problem because model performance is sensitive to dataset characteristics such as sparsity, sequential structure, and scale. This creates a demand for a sound methodology for fair comparison between algorithms. Naive aggregation of performance metrics (e.g., averaging NDCG over benchmarks) can yield misleading rankings, undermining practical model selection. To address this problem, we introduce a novel, data-driven ranking methodology based on the Bradley-Terry (BT) model. We demonstrate that the obtained ranking depends on key dataset statistics. Additionally, we propose a novel metric for evaluating ranking consistency and demonstrate the robustness of our ranking to incomplete data. Finally, we introduce a dataset-specific methodology for ranking algorithms on unseen datasets without running the models, relying on extensions of the Bradley-Terry framework, including BT trees and BT models with covariates.
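To make the core idea concrete, the following is a minimal sketch of a plain Bradley-Terry fit, not necessarily the authors' procedure. In the standard BT model, algorithm i beats algorithm j with probability p_i / (p_i + p_j), where p_i > 0 is a latent strength. The sketch assumes that a "win" means a strictly higher metric value on a dataset, that ties and missing results (NaN) contribute no comparison, and that every algorithm wins and loses at least once so the maximum-likelihood estimate exists. The score table, the function names `pairwise_wins` and `fit_bradley_terry`, and the use of the classical minorization-maximization (MM) update are illustrative assumptions.

```python
import numpy as np


def pairwise_wins(scores):
    """Count pairwise wins from a (n_datasets, n_algorithms) metric table.

    wins[i, j] is the number of datasets on which algorithm i scores
    strictly higher than algorithm j. NaN entries (algorithm not run on
    that dataset) produce no comparisons.
    """
    scores = np.asarray(scores, dtype=float)
    n_algos = scores.shape[1]
    wins = np.zeros((n_algos, n_algos))
    for row in scores:
        valid = ~np.isnan(row)
        both_valid = valid[:, None] & valid[None, :]
        with np.errstate(invalid="ignore"):
            better = row[:, None] > row[None, :]
        wins += better & both_valid
    return wins


def fit_bradley_terry(wins, n_iter=1000, tol=1e-10):
    """Estimate BT strengths with the MM update:

        p_i <- W_i / sum_j n_ij / (p_i + p_j)

    where W_i is the total number of wins of i and n_ij is the number of
    comparisons between i and j. Strengths are normalized to sum to 1.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    games = wins + wins.T              # n_ij, zero on the diagonal
    total_wins = wins.sum(axis=1)      # W_i
    p = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        denom = (games / (p[:, None] + p[None, :])).sum(axis=1)
        new_p = total_wins / denom
        new_p /= new_p.sum()
        converged = np.max(np.abs(new_p - p)) < tol
        p = new_p
        if converged:
            break
    return p


# Hypothetical NDCG values: rows are datasets, columns are algorithms A, B, C.
scores = np.array([
    [0.30, 0.20, 0.10],
    [0.25, 0.30, 0.10],
    [0.40, 0.35, 0.38],
    [0.10, 0.12, 0.15],
])

wins = pairwise_wins(scores)
strengths = fit_bradley_terry(wins)
ranking = np.argsort(-strengths)  # algorithm indices, strongest first
print(wins)
print(strengths, ranking)
```

Because the fit uses only pairwise outcomes, it is insensitive to the absolute scale of the metric on each dataset, which is one reason a comparison-based ranking can differ from a ranking obtained by averaging NDCG. Conditioning the strengths on dataset statistics, as in BT trees or BT models with covariates, goes beyond this sketch.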