Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

arXiv.org Machine Learning

We study contextual dynamic pricing in a semiparametric scalar-index valuation model where the latent value is $v_t=\mu_\ast(\mathsf c_t)+\xi_t$, with an unknown utility map $\mu_\ast$ and an unknown additive noise distribution. The key decision object is the one-dimensional oracle price map $u\mapsto p^\ast(u)$ induced by the scalar index $u=\mu_\ast(\mathsf c)$ and the noise tail. Under the $\beta$-Hölder smoothness of the tail function for $\beta\geq 2$ and a revenue-geometry condition that gives a unique, stable, interior maximizer, this oracle map is itself $(\beta-1)$-smooth. We exploit such structure through $\mathsf{ORBIT}$, a modular coarse-to-fine policy that takes a scalar pilot index as input, localizes a benchmark price in each active bin, and learns a local polynomial approximation of the oracle map inside a trust region via bandit convex optimization. For the baseline linear utility model $\mu_\ast(\mathsf c)=\mathsf c^\top\theta_\ast$, an adaptive elliptical exploration scheme constructs the required scalar pilot online without distributional assumptions on the contexts. The resulting policy achieves regret $\widetilde{O}\big(T^{\frac{2\beta-1}{4\beta-3}}+\sqrt{dT}\big)$. For fixed $d$, we establish a matching lower bound in the horizon dependence, unveiling that the nonparametric oracle-map learning term is minimax sharp. The same scalar-pilot interface also yields extensions to sparse high-dimensional linear utility and nonparametric Hölder utility.
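To make the oracle price map concrete, here is a minimal sketch, not the paper's method. It assumes a sale occurs when the latent value is at least the posted price, so that expected revenue at index u is p times the noise tail evaluated at p - u. It also assumes standard logistic noise purely for illustration, and that revenue is unimodal on a hypothetical price range [0, p_max]. The names `logistic_tail`, `expected_revenue`, `oracle_price`, and `p_max` are illustrative only.

```python
import math


def logistic_tail(z):
    """Tail S(z) = P(xi > z) for standard logistic noise (assumed for illustration)."""
    return 1.0 / (1.0 + math.exp(z))


def expected_revenue(p, u, tail=logistic_tail):
    """Expected revenue of posting price p at index u: p * P(u + xi >= p)."""
    return p * tail(p - u)


def oracle_price(u, tail=logistic_tail, p_max=20.0, iters=200):
    """Golden-section search for the revenue maximizer on [0, p_max].

    Relies on the revenue curve being unimodal in p, which holds for the
    logistic tail used here; p_max is a hypothetical price cap.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, p_max
    for _ in range(iters):
        a = hi - inv_phi * (hi - lo)  # left interior probe
        b = lo + inv_phi * (hi - lo)  # right interior probe
        if expected_revenue(a, u, tail) < expected_revenue(b, u, tail):
            lo = a  # maximizer lies to the right of a
        else:
            hi = b  # maximizer lies to the left of b
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Tabulate the one-dimensional map u -> p*(u) on a small grid of indices.
    for u in (0.0, 1.0, 2.0, 3.0):
        print(u, oracle_price(u))
```

In this toy setting the map is increasing in u, and once it is known the pricing decision depends on the context only through the scalar index, which is the structure the abstract describes exploiting.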


A Related Work

Neural Information Processing Systems

In contrast, our work is concerned with an overall limit on the total amount of information an agent may acquire from the environment and, in turn, how that translates into its selection of a feasible learning target.


Contextual Dynamic Pricing with Unknown Noise: Explore-then-UCB Strategy and Improved Regrets

Neural Information Processing Systems

A lot of work has been done for this problem with known noise. In this paper, we consider a contextual dynamic pricing problem under a linear customer valuation model with an unknown market noise distribution $F$.


