On the Learning Curves of Revenue Maximization

2026-04-29 · Machine Learning

Machine Learning · Data Structures and Algorithms · Computer Science and Game Theory
AI summary

The authors study how quickly algorithms improve their revenue predictions as they see more data, focusing on selling a single item to a single buyer. Unlike earlier work that evaluates worst-case scenarios, they examine the actual learning curves that trace error reduction as the sample size grows. They find that while some algorithms eventually learn any distribution perfectly, this convergence can be arbitrarily slow; however, if the optimal revenue is achieved at a finite price, the error decays at a steady rate of roughly $1/\sqrt{n}$. For buyers whose values come from a discrete set, learning happens much faster than previously thought. Overall, the work gives a clearer picture of how learning speed depends on the buyer's valuation distribution.

learning curves · revenue maximization · Bayes-consistent algorithm · PAC learning framework · generalization ability · valuation distribution · error decay rate · single-item auction · sample complexity · discrete value distributions
Authors
Steve Hanneke, Alkis Kalavasis, Shay Moran, Grigoris Velegkas
Abstract
Learning curves are a fundamental primitive in supervised learning, describing how an algorithm's performance improves with more data and providing a quantitative measure of its generalization ability. Formally, a learning curve plots the decay of an algorithm's error for a fixed underlying distribution as a function of the number of training samples. Prior work on revenue-maximizing learning algorithms, starting with the seminal work of Cole and Roughgarden [STOC, 2014], adopts a distribution-free perspective, which parallels the PAC learning framework in learning theory. This approach evaluates performance against the hardest possible sequence of valuation distributions, one for each sample size, effectively defining the upper envelope of learning curves over all possible distributions, thus leading to error bounds that do not capture the shape of the learning curves. In this work we initiate the study of learning curves for revenue maximization and provide a near-complete characterization of their rate of decay in the basic setting of a single item and a single buyer. In the absence of any restriction on the valuation distribution, we show that there exists a Bayes-consistent algorithm, meaning that its learning curve converges to zero for any arbitrary valuation distribution as the number of samples $n \to \infty$. However, this convergence must be arbitrarily slow, even if the optimal revenue is finite. In contrast, if the optimal revenue is achieved by a finite price, then the optimal rate of decay is roughly $1/\sqrt{n}$. Finally, for distributions supported on discrete sets of values, we show that learning curves decay almost exponentially fast, a rate unattainable under the PAC framework.
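To make the setting concrete, the following is a minimal sketch (not the authors' algorithm) of the standard empirical-revenue-maximization approach for one item and one buyer: given $n$ sampled valuations, post the price $p$ that maximizes the empirical revenue $p \cdot \widehat{\Pr}[\text{value} \ge p]$. Restricting candidate prices to the samples themselves is sufficient, since empirical revenue only changes at sample points. The distribution and sample size below are illustrative assumptions.

```python
import numpy as np

def empirical_revenue_price(samples):
    """Return the sample price maximizing empirical revenue
    p * (fraction of sampled valuations >= p)."""
    samples = np.sort(np.asarray(samples, dtype=float))
    n = len(samples)
    # After sorting, exactly n - i samples are >= samples[i],
    # so the empirical revenue at price samples[i] is:
    revenues = samples * (n - np.arange(n)) / n
    best = int(np.argmax(revenues))
    return samples[best], revenues[best]

# Illustrative example: valuations drawn from an Exponential(1) distribution,
# whose true revenue curve p * exp(-p) peaks at p = 1 with revenue 1/e.
rng = np.random.default_rng(0)
vals = rng.exponential(scale=1.0, size=1000)
price, rev = empirical_revenue_price(vals)
```

A learning curve in the paper's sense tracks the expected gap between the revenue of the learned price and the optimal revenue for this fixed distribution, as a function of $n$; the abstract's $1/\sqrt{n}$ rate concerns distributions (like this one) whose optimal revenue is attained at a finite price.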