Vijayan, Sushant
Online Bidding under RoS Constraints without Knowing the Value
Vijayan, Sushant, Feng, Zhe, Padmanabhan, Swati, Shanmugam, Karthikeyan, Suggala, Arun, Wang, Di
Online advertising, a multi-billion dollar industry, relies on real-time auctions to connect advertisers with users. These auctions, triggered by user queries or website visits, allow advertisers to bid for advertising slots, such as prominent placements on search engine results pages or in social media feeds. Advertisers aim to maximize their returns, measured in conversions or other relevant metrics, by carefully determining their bids while adhering to budget constraints and desired return-on-spend (RoS) targets. To achieve this, a wide array of bidding strategies has been developed, leveraging techniques from optimization, online learning, and game theory to maximize advertiser utility [2, 7, 11, 23, 29, 30, 36, 47, 52]. Bidding without knowing the value of an impression in advance introduces a challenging exploration-exploitation dilemma: the advertiser must balance exploring different bids to estimate impression values with exploiting current knowledge to bid effectively. To address this, we propose a novel Upper Confidence Bound (UCB)-style algorithm that carefully manages this trade-off. Via a rigorous theoretical analysis, we prove that our algorithm achieves Õ(√T log(|B|)) regret and constraint violation, where T is the number of bidding rounds and B is the domain of possible bids. This establishes the first optimal regret and constraint violation bounds for bidding in the online setting with unknown impression values.
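As a rough illustration of the UCB-style idea described in the abstract (not the paper's actual algorithm, which additionally manages the RoS constraint), the sketch below runs a generic upper-confidence-bound rule over a finite bid grid B in a simulated second-price auction. The auction model, bid grid, and utility definition are hypothetical stand-ins chosen only for illustration.

```python
import numpy as np

# Minimal sketch: generic UCB over a finite bid grid in a toy second-price
# auction. The value/competition distributions are hypothetical; the paper's
# algorithm additionally enforces the return-on-spend (RoS) constraint.

rng = np.random.default_rng(0)
B = np.linspace(0.1, 1.0, 10)    # finite domain of candidate bids
T = 5000                          # number of bidding rounds

counts = np.zeros(len(B))         # times each bid was played
mean_utility = np.zeros(len(B))   # running mean of observed utility per bid
total_value, total_spend = 0.0, 0.0

for t in range(1, T + 1):
    # UCB index: empirical mean + exploration bonus (unplayed bids first)
    bonus = np.sqrt(2 * np.log(t) / np.maximum(counts, 1))
    ucb = np.where(counts == 0, np.inf, mean_utility + bonus)
    i = int(np.argmax(ucb))

    # Hypothetical feedback: unknown impression value, second-price payment
    value = rng.uniform(0.0, 1.0)
    competing_bid = rng.uniform(0.0, 1.0)
    win = B[i] >= competing_bid
    payment = competing_bid if win else 0.0
    utility = (value - payment) if win else 0.0

    # Update statistics for the chosen bid
    counts[i] += 1
    mean_utility[i] += (utility - mean_utility[i]) / counts[i]
    total_value += value if win else 0.0
    total_spend += payment

print("realized RoS:", total_value / max(total_spend, 1e-9))
```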
Best arm identification in rare events
Bhattacharjee, Anirban, Vijayan, Sushant, Juneja, Sandeep K
We consider the best arm identification (BAI) problem in the stochastic multi-armed bandit framework, where each arm has a tiny probability of realizing a large reward and with overwhelming probability yields zero reward. A key application of this framework is online advertising, where click rates of advertisements can be a fraction of a single percent, and final conversion to sales, while highly profitable, may again be a small fraction of the click rate. Recently, BAI algorithms have been developed that minimize sample complexity while providing statistical guarantees on correct arm selection. As we observe, these algorithms can be computationally prohibitive. We exploit the fact that the reward process for each arm is well approximated by a Compound Poisson process to arrive at algorithms that are faster, with a small increase in sample complexity. We analyze the problem in an asymptotic regime in which the probability of reward occurrence decreases to zero and the reward amounts increase to infinity. This regime helps illustrate the benefits of the proposed algorithm and sheds light on the underlying structure of optimal BAI algorithms in the rare-event setting.
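The structural fact exploited above, that the total reward of a rare-reward arm is well approximated by a Compound Poisson sum, can be illustrated with the short sketch below. The exponential reward-amount distribution and the specific parameters are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

# Minimal sketch: over n pulls, an arm paying a large reward with tiny
# probability p has total reward close in distribution to a Compound Poisson
# sum with rate n * p. Reward amounts here are exponential (a hypothetical
# choice); the comparison of means/stds illustrates the approximation.

rng = np.random.default_rng(1)

def arm_total_reward(n, p, mean_amount):
    """Exact model: each pull pays a reward with tiny probability p."""
    hits = rng.random(n) < p
    return rng.exponential(mean_amount, size=hits.sum()).sum()

def compound_poisson_total(n, p, mean_amount):
    """Approximation: number of nonzero rewards drawn as Poisson(n * p)."""
    k = rng.poisson(n * p)
    return rng.exponential(mean_amount, size=k).sum()

n, p, mean_amount = 100_000, 1e-3, 100.0   # rare occurrences, large amounts
exact = np.array([arm_total_reward(n, p, mean_amount) for _ in range(1000)])
approx = np.array([compound_poisson_total(n, p, mean_amount) for _ in range(1000)])
print("exact  mean / std:", exact.mean(), exact.std())
print("approx mean / std:", approx.mean(), approx.std())
```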