Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets
–Neural Information Processing Systems
We study a two-sided market in which price-sensitive, heterogeneous customers and servers arrive and join their respective queues. The platform can match a compatible customer-server pair, at which point both leave the system. Our objective is to design pricing and matching algorithms that maximize the platform's profit while maintaining reasonable queue lengths. As the demand and supply curves governing the price-dependent arrival rates may not be known in practice, we design a novel online-learning-based pricing policy and establish its near-optimality. In particular, we prove a tradeoff among three performance metrics: $\tilde{O}(T^{1-\gamma})$ regret, $\tilde{O}(T^{\gamma/2})$ average queue length, and $\tilde{O}(T^{\gamma})$ maximum queue length for $\gamma \in (0, 1/6]$, significantly improving over existing results [1]. Moreover, apart from the restriction on the permissible range of $\gamma$, we show that this tradeoff between regret and average queue length is optimal up to logarithmic factors under a class of policies, matching the optimal tradeoff in [2], which assumes the demand and supply curves to be known. Our proposed policy has two noteworthy features: a dynamic component that optimizes the tradeoff between low regret and small queue lengths, and a probabilistic component that resolves the tension between obtaining useful samples for fast learning and maintaining small queue lengths.
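To make the stated tradeoff concrete, the following minimal sketch tabulates the exponents of $T$ in the three bounds for a chosen $\gamma$. It is only an illustration of the rates quoted above: constants and logarithmic factors are ignored, and the function name `tradeoff_exponents` is a hypothetical helper, not part of the paper.

```python
from fractions import Fraction


def tradeoff_exponents(gamma):
    """Return the exponents of T in the three bounds quoted in the abstract.

    Illustrative only: constants and logarithmic factors are dropped.
    """
    gamma = Fraction(gamma)
    # The abstract states the result for gamma in (0, 1/6].
    if not (0 < gamma <= Fraction(1, 6)):
        raise ValueError("gamma must lie in (0, 1/6]")
    return {
        "regret": 1 - gamma,        # regret grows like T^(1 - gamma)
        "avg_queue": gamma / 2,     # average queue length like T^(gamma / 2)
        "max_queue": gamma,         # maximum queue length like T^gamma
    }


# Largest permitted gamma: lowest regret exponent, largest queue exponents.
exps = tradeoff_exponents(Fraction(1, 6))
for name, exponent in exps.items():
    print(f"{name}: T^({exponent})")
```

Increasing $\gamma$ lowers the regret exponent while raising both queue-length exponents, which is the tension the policy's dynamic component is described as balancing.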
Jun-22-2026, 22:02:28 GMT