Online Network Revenue Management Using Thompson Sampling

Author Abstract

We consider a network revenue management problem in which an online retailer aims to maximize revenue from multiple products with limited inventory. As is common in practice, the retailer does not know the expected demand at each price and must learn it from sales data. We propose an efficient and effective dynamic pricing algorithm that builds on the Thompson sampling algorithm used for multi-armed bandit problems by incorporating inventory constraints into the pricing decisions. Our algorithm has both strong theoretical performance guarantees and promising numerical performance when compared to other algorithms developed for the same setting. More broadly, our paper contributes to the literature on the multi-armed bandit problem with resource constraints, since our algorithm applies directly to this setting when the inventory constraints are interpreted as general resource constraints.
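To make the idea of combining Thompson sampling with an inventory constraint concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the paper's algorithm: it handles a single product with at most one unit of demand per period, uses Beta posteriors over the purchase probability at each price, and caps the expected sales rate at the remaining inventory divided by the remaining periods. The function names `solve_price_mix` and `run_sketch`, the uniform Beta(1, 1) priors, and the option of offering no price in a period are all assumptions made for this example.

```python
import random
from itertools import combinations


def solve_price_mix(prices, demand, rate_cap):
    """Pick probabilities x_k over prices that maximize sum x_k * p_k * d_k
    subject to sum x_k * d_k <= rate_cap, sum x_k <= 1, x_k >= 0.

    With a single inventory constraint, some optimal solution uses at most
    two prices, so enumerating those candidate vertices is enough here.
    """
    n = len(prices)
    best_value, best_mix = 0.0, [0.0] * n

    def consider(mix):
        nonlocal best_value, best_mix
        value = sum(x * p * d for x, p, d in zip(mix, prices, demand))
        if value > best_value:
            best_value, best_mix = value, mix

    # Candidates that use a single price, throttled to respect the cap.
    for k in range(n):
        if demand[k] > 0:
            mix = [0.0] * n
            mix[k] = min(1.0, rate_cap / demand[k])
            consider(mix)

    # Candidates that mix two prices so both constraints are tight.
    for j, k in combinations(range(n), 2):
        gap = demand[j] - demand[k]
        if abs(gap) < 1e-12:
            continue
        x_j = (rate_cap - demand[k]) / gap
        if 0.0 <= x_j <= 1.0:
            mix = [0.0] * n
            mix[j], mix[k] = x_j, 1.0 - x_j
            consider(mix)

    return best_mix


def run_sketch(prices, true_demand, inventory, horizon, seed=0):
    """Simulate the sketch; true_demand is hidden from the pricing logic."""
    rng = random.Random(seed)
    n = len(prices)
    alpha = [1.0] * n  # Beta(1, 1) priors (an assumption of this sketch)
    beta = [1.0] * n
    revenue, left = 0.0, inventory

    for t in range(horizon):
        if left == 0:
            break
        # Thompson step: sample one demand estimate per price.
        sampled = [rng.betavariate(alpha[k], beta[k]) for k in range(n)]
        # Inventory step: cap expected sales at remaining stock per period.
        mix = solve_price_mix(prices, sampled, left / (horizon - t))

        # Draw a price from the mix; leftover probability means no offer.
        u, acc, chosen = rng.random(), 0.0, None
        for k, x in enumerate(mix):
            acc += x
            if u < acc:
                chosen = k
                break
        if chosen is None:
            continue

        sold = 1 if rng.random() < true_demand[chosen] else 0
        alpha[chosen] += sold
        beta[chosen] += 1 - sold
        revenue += sold * prices[chosen]
        left -= sold

    return revenue, left
```

For example, `run_sketch([1.0, 2.0, 3.0], [0.8, 0.5, 0.2], inventory=50, horizon=200)` simulates 200 periods with 50 units in stock. When inventory is scarce relative to the horizon, the optimization step tends to shift probability toward higher prices instead of selling out early at a low price.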

Paper Information

Working Paper Publication Date: September 2015
HBS Working Paper Number: 16-031
Faculty Unit(s): Technology and Operations Management
