

Dynamic Pricing and Learning with Bayesian Persuasion

Neural Information Processing Systems

We consider a novel dynamic pricing and learning setting where, in addition to setting prices of products in sequential rounds, the seller also ex-ante commits to 'advertising schemes'. That is, at the beginning of each round the seller decides what kind of signal to provide to the buyer about the product's quality upon realization. Using the popular Bayesian persuasion framework to model the effect of these signals on the buyers' valuations and purchase responses, we formulate the problem of jointly designing an advertising scheme and a pricing scheme that maximize the seller's expected revenue. Without any a priori knowledge of the buyers' demand function, our goal is to design an online algorithm that uses past purchase responses to adaptively learn the optimal pricing and advertising strategy. We study the regret of the algorithm compared to the optimal clairvoyant price and advertising scheme.
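To make the persuasion ingredient concrete, here is a minimal sketch of the classic binary-state persuasion calculation (an illustrative toy under assumed linear buyer utility, not the paper's algorithm): product quality is high with prior probability `mu`, the seller commits to a "buy" signal sent always in the high state and with probability `x` in the low state, and `x` is chosen so the buyer's posterior just clears the posted price.

```python
# Toy Bayesian persuasion sketch (assumptions: binary quality with value 1
# or 0, risk-neutral buyer who buys iff posterior expected value >= price;
# this is NOT the paper's general setting or algorithm).
def persuasion_revenue(mu, p):
    """Expected per-round revenue at price p under the optimal binary signal."""
    if mu >= p:                  # buyer buys with no information at all
        return p
    # "buy" posterior: mu / (mu + (1 - mu) * x) >= p  =>  largest feasible x
    x = mu * (1 - p) / ((1 - mu) * p)
    prob_buy = mu + (1 - mu) * x           # probability the "buy" signal fires
    return p * prob_buy

# With prior mu = 0.3 and price p = 0.5: x = 3/7, prob_buy = 0.6, revenue 0.3.
print(round(persuasion_revenue(0.3, 0.5), 3))
```

Note that in this 0/1-value toy the revenue collapses to `mu` for any price above the prior, a standard feature of the binary persuasion example.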


Beyond Demand Estimation: Consumer Surplus Evaluation via Cumulative Propensity Weights

Bian, Zeyu, Biggs, Max, Gao, Ruijiang, Qi, Zhengling

arXiv.org Machine Learning

This paper develops a practical framework for using observational data to audit the consumer-surplus effects of AI-driven decisions, specifically in targeted pricing and algorithmic lending. Traditional approaches first estimate demand functions and then integrate to compute consumer surplus, but these methods can be difficult to implement in practice due to model misspecification in parametric demand forms and the large data requirements and slow convergence of flexible nonparametric or machine learning approaches. Instead, we exploit the randomness inherent in modern algorithmic pricing, arising from the need to balance exploration and exploitation, and introduce an estimator that avoids explicit estimation and numerical integration of the demand function. Each observed purchase outcome at a randomized price is an unbiased estimate of demand, and by carefully reweighting purchase outcomes using novel cumulative propensity weights (CPW), we are able to reconstruct the integral. Building on this idea, we introduce a doubly robust variant, the augmented cumulative propensity weighting (ACPW) estimator, which requires only that one of the demand model or the historical pricing-policy distribution be correctly specified. Furthermore, this approach facilitates the use of flexible machine learning methods for estimating consumer surplus, since it achieves fast convergence rates by incorporating an estimate of demand, even when that machine learning estimate converges slowly. Neither estimator is a standard application of off-policy evaluation techniques, as the target estimand, consumer surplus, is unobserved. To address fairness, we extend this framework to an inequality-aware surplus measure, allowing regulators and firms to quantify the profit-equity trade-off. Finally, we validate our methods through comprehensive numerical studies.
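The core reweighting idea can be illustrated with a simple inverse-propensity toy (all modeling choices here, including linear demand and uniform historical prices, are illustrative assumptions, not the paper's CPW/ACPW estimators): consumer surplus at price `p0` is the integral of demand above `p0`, and each purchase outcome at a randomized price, divided by the price density, is an unbiased contribution to that integral.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumptions): linear demand D(p) = 1 - p on [0, 1], historical
# prices drawn uniformly, so the price density is f(p) = 1 on [0, 1].
n = 200_000
prices = rng.uniform(0.0, 1.0, n)                       # randomized prices
purchases = (rng.uniform(0.0, 1.0, n) < 1.0 - prices).astype(float)

p0 = 0.3  # price at which we evaluate consumer surplus

# Ground truth: integral of D(p) from p0 to 1 equals (1 - p0)^2 / 2.
true_cs = (1.0 - p0) ** 2 / 2

# IPW-style estimate: purchase outcomes at prices above p0, reweighted by
# 1 / f(p_i) (= 1 here), average to the surplus integral without ever
# fitting a demand curve.
est_cs = np.mean(purchases * (prices >= p0) / 1.0)

print(round(true_cs, 3), round(est_cs, 3))
```

The doubly robust ACPW variant described in the abstract would additionally subtract a fitted demand model and add back its integral, retaining unbiasedness if either component is correct.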


Dynamic Pricing with Monotonicity Constraint under Unknown Parametric Demand Model

Neural Information Processing Systems

We consider the continuum bandit problem, where the goal is to find the optimal action under an unknown reward function, with an additional monotonicity constraint (or markdown constraint) requiring the action sequence to be non-increasing. This problem faithfully models a natural single-product dynamic pricing problem, called markdown pricing, where the objective is to adaptively reduce the price over a finite sales horizon to maximize expected revenue. Jia et al. '21 and Chen '21 independently showed a tight $T^{3/4}$ regret bound over $T$ rounds under *minimal* assumptions of unimodality and Lipschitzness of the reward (i.e., revenue) function. This bound shows that demand learning in markdown pricing is harder than unconstrained pricing (i.e., without the monotonicity constraint) under unknown demand, which suffers regret only of order $T^{2/3}$ under the same assumptions (Kleinberg '04). However, in practice demand functions are usually assumed to have certain functional forms (e.g.
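A minimal sketch of the markdown constraint (an illustrative toy under an assumed unimodal revenue curve, not the algorithms of Jia et al. or Chen): the price may only move downward, so a simple grid policy steps the price down and stops marking down once the estimated per-round revenue begins to fall.

```python
import numpy as np

# Toy markdown pricing (assumptions: known price grid, unimodal expected
# revenue, Bernoulli demand; NOT the papers' regret-optimal algorithms).
def markdown_path(demand_prob, grid, rounds_per_price=50_000, seed=0):
    rng = np.random.default_rng(seed)
    best_avg, path = -np.inf, []
    for p in grid:                        # grid must be sorted high -> low
        sales = rng.random(rounds_per_price) < demand_prob(p)
        avg_rev = p * sales.mean()        # estimated per-round revenue at p
        path.append(p)
        if avg_rev < best_avg:            # revenue fell: stop marking down
            break
        best_avg = avg_rev
    return path, best_avg

# Linear demand D(p) = 1 - p gives revenue p(1 - p), maximized at p = 0.5,
# so the markdown path should stop shortly after passing 0.5.
path, rev = markdown_path(lambda p: 1.0 - p, np.arange(0.9, 0.05, -0.1))
print([round(p, 1) for p in path], round(rev, 3))
```

The irreversibility is exactly what makes learning harder here: the policy can never return to an overshot price, unlike in unconstrained pricing.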


Contextual Dynamic Pricing with Heterogeneous Buyers

Lykouris, Thodoris, Nietert, Sloan, Okoroafor, Princewill, Podimata, Chara, Zimmert, Julian

arXiv.org Artificial Intelligence

We initiate the study of contextual dynamic pricing with a heterogeneous population of buyers, where a seller repeatedly posts prices (over $T$ rounds) that depend on the observable $d$-dimensional context and receives binary purchase feedback. Unlike prior work assuming homogeneous buyer types, in our setting the buyer's valuation type is drawn from an unknown distribution with finite support size $K_{\star}$. We develop a contextual pricing algorithm based on optimistic posterior sampling with regret $\widetilde{O}(K_{\star}\sqrt{dT})$, which we prove to be tight in $d$ and $T$ up to logarithmic terms. Finally, we refine our analysis for the non-contextual pricing case, proposing a variance-aware zooming algorithm that achieves the optimal dependence on $K_{\star}$.
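As a crude non-contextual stand-in for the optimistic approach described above (an illustrative toy with assumed known candidate prices, not the paper's optimistic posterior sampling or zooming algorithms), one can run an optimistic bandit rule over candidate prices at the buyer types' valuations, observing only binary purchase feedback:

```python
import numpy as np

# Toy pricing with heterogeneous buyers (assumptions: K = 2 valuation types,
# candidate prices known; a plain UCB rule stands in for the paper's
# optimistic posterior sampling).
def ucb_pricing(values, weights, candidates, rounds=20_000, seed=0):
    rng = np.random.default_rng(seed)
    n = np.zeros(len(candidates))          # pull counts per candidate price
    s = np.zeros(len(candidates))          # cumulative revenue per price
    for t in range(1, rounds + 1):
        v = rng.choice(values, p=weights)  # buyer's hidden valuation type
        ucb = s / np.maximum(n, 1) + np.sqrt(2 * np.log(t) / np.maximum(n, 1))
        i = int(np.argmax(ucb))
        reward = candidates[i] * float(v >= candidates[i])  # binary feedback
        n[i] += 1
        s[i] += reward
    return candidates[int(np.argmax(s / np.maximum(n, 1)))]

# Two types at valuations 0.4 and 0.9, equally likely: posting 0.9 earns
# 0.45 per round in expectation versus 0.4 for always selling at 0.4.
best = ucb_pricing([0.4, 0.9], [0.5, 0.5], [0.4, 0.9])
print(best)
```

The hard part the paper addresses, absent from this toy, is that the support points and the context-dependent valuations are unknown and must be learned.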


We thank the reviewers for their insightful comments and suggestions on our paper

Neural Information Processing Systems

We thank the reviewers for their insightful comments and suggestions on our paper, and for pointing out these related papers. The (private) buyer's valuation of this product remains fixed across time. With respect to Cohen et al. (and their tricks for robustness), their modified policy gets a regret of In comparison, we do not require such an assumption. We will add a "Conclusion" section in the revision, and in our Related Work section we will add the following w.r.t.


Appendix A. Proof of Theorem

Neural Information Processing Systems

Here we provide a proof of Theorem 1. Now, it suffices to prove Eqs. (6), (8) and (9). This completes the proof of Eq. (8). Before we proceed to bound the total regret and prove Eq. (9). This completes the proof of Eq. (9). By applying Hölder's inequality, we have for any $a, b \in \mathbb{R}$ The proof of this subcase is similar to subcase 2.2 and is omitted.
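The specific inequality invoked above was lost in extraction; for reference, the standard statement of Hölder's inequality for finite sequences (of which the two-scalar case used here is an instance) is:

```latex
% Standard Hölder inequality (reference statement only; the exact form
% used in the original proof did not survive extraction).
\[
  \sum_{i=1}^{n} |a_i b_i|
  \;\le\;
  \Bigl(\sum_{i=1}^{n} |a_i|^{p}\Bigr)^{1/p}
  \Bigl(\sum_{i=1}^{n} |b_i|^{q}\Bigr)^{1/q},
  \qquad \frac{1}{p} + \frac{1}{q} = 1, \quad p, q > 1.
\]
```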