Balancing Gradient and Hessian Queries in Non-Convex Optimization

Neural Information Processing Systems

We develop optimization methods which offer new trade-offs between the number of gradient and Hessian computations needed to compute a critical point of a nonconvex function. We provide a method that, for a twice-differentiable f: R^d → R with L_2-Lipschitz Hessian, an initial point with bounded sub-optimality, and a sufficiently small ϵ > 0, outputs an ϵ-critical point, i.e., a point x such that


A Bayesian Approach to Contextual Dynamic Pricing using the Proportional Hazards Model with Discrete Price Data

Neural Information Processing Systems

Dynamic pricing algorithms typically assume continuous price variables, which may not reflect real-world scenarios where prices are often discrete. This paper demonstrates that leveraging discrete price information within a semi-parametric model can substantially improve performance, depending on the size of the support set of the price variable relative to the time horizon. Specifically, we propose a novel semi-parametric contextual dynamic pricing algorithm, namely BayesCoxCP, based on a Bayesian approach to the Cox proportional hazards model. Our theoretical analysis establishes high-probability regret bounds that adapt to the sparsity level $\gamma$, proving that our algorithm achieves a regret upper bound of $\widetilde{O}(T^{(1+\gamma)/2} + \sqrt{dT})$ for $\gamma < 1/3$ and $\widetilde{O}(T^{2/3} + \sqrt{dT})$ for $\gamma \geq 1/3$, where $\gamma$ represents the sparsity of the price grid relative to the time horizon $T$. Through numerical experiments, we demonstrate that our proposed algorithm significantly outperforms an existing method, particularly in scenarios with sparse discrete price points.
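As a rough illustration of how the stated bound depends on the sparsity level, the sketch below evaluates only the exponent of T in the leading term, reading the bound as T^{(1+γ)/2} for γ < 1/3 and T^{2/3} for γ ≥ 1/3. The function name `regret_exponent` is hypothetical and not from the paper; logarithmic factors and the dimension-dependent term are ignored.

```python
from fractions import Fraction


def regret_exponent(gamma):
    """Exponent of T in the leading regret term (hypothetical helper).

    Assumes the piecewise reading of the abstract's bound:
    (1 + gamma) / 2 when gamma < 1/3, and 2/3 otherwise.
    """
    gamma = Fraction(gamma)
    if gamma < Fraction(1, 3):
        return (1 + gamma) / 2
    return Fraction(2, 3)


# Print the exponent for a few sparsity levels, using exact arithmetic.
for g in (Fraction(0), Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)):
    print(f"gamma = {g}: T exponent = {regret_exponent(g)}")
```

Under this reading the two branches agree at γ = 1/3, since (1 + 1/3)/2 = 2/3, so the exponent grows from 1/2 at γ = 0 and then stays flat at 2/3.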


Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

arXiv.org Machine Learning

We study contextual dynamic pricing in a semiparametric scalar-index valuation model where the latent value is $v_t=\mu_\ast(\mathsf c_t)+\xi_t$, with an unknown utility map $\mu_\ast$ and an unknown additive noise distribution. The key decision object is the one-dimensional oracle price map $u\mapsto p^\ast(u)$ induced by the scalar index $u=\mu_\ast(\mathsf c)$ and the noise tail. Under the $\beta$-Hölder smoothness of the tail function for $\beta\geq 2$ and a revenue-geometry condition that gives a unique, stable, interior maximizer, this oracle map is itself $(\beta-1)$-smooth. We exploit such structure through $\mathsf{ORBIT}$, a modular coarse-to-fine policy that takes a scalar pilot index as input, localizes a benchmark price in each active bin, and learns a local polynomial approximation of the oracle map inside a trust region via bandit convex optimization. For the baseline linear utility model $\mu_\ast(\mathsf c)=\mathsf c^\top\theta_\ast$, an adaptive elliptical exploration scheme constructs the required scalar pilot online without distributional assumptions on the contexts. The resulting policy achieves regret $\widetilde{O}\big(T^{\frac{2\beta-1}{4\beta-3}}+\sqrt{dT}\big)$. For fixed $d$, we establish a matching lower bound in the horizon dependence, unveiling that the nonparametric oracle-map learning term is minimax sharp. The same scalar-pilot interface also yields extensions to sparse high-dimensional linear utility and nonparametric Hölder utility.
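To make the horizon dependence in the stated regret bound concrete, the following sketch evaluates the exponent (2β − 1)/(4β − 3) of the nonparametric term for a few smoothness levels. The helper name `oracle_map_exponent` is hypothetical, and the check on β simply mirrors the condition β ≥ 2 given in the abstract.

```python
from fractions import Fraction


def oracle_map_exponent(beta):
    """Exponent of T in the oracle-map learning term (hypothetical helper).

    Evaluates (2*beta - 1) / (4*beta - 3) exactly; the abstract states
    the rate under beta >= 2, so smaller values are rejected.
    """
    beta = Fraction(beta)
    if beta < 2:
        raise ValueError("the abstract states the rate for beta >= 2")
    return (2 * beta - 1) / (4 * beta - 3)


# Smoother tails give an exponent closer to 1/2.
for b in (2, 3, 10):
    print(f"beta = {b}: T exponent = {oracle_map_exponent(b)}")
```

The exponent exceeds 1/2 by exactly 1/(2(4β − 3)), so it equals 3/5 at β = 2 and decreases toward 1/2 as β grows.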


A Related Work

Neural Information Processing Systems

In contrast, our work is concerned with an overall limit on the total amount of information an agent may acquire from the environment and, in turn, how that translates into its selection of a feasible learning target.


Contextual Dynamic Pricing with Unknown Noise: Explore-then-UCB Strategy and Improved Regrets

Neural Information Processing Systems

Much prior work has addressed this problem under known noise. In this paper, we consider a contextual dynamic pricing problem under a linear customer valuation model with an unknown market noise distribution F.


