Optimal Regularized Online Allocation by Adaptive Re-Solving

Ma, Wanteng, Cao, Ying, Tsang, Danny H. K., Xia, Dong

arXiv.org Artificial Intelligence 

Online resource allocation seeks to maximize the total rewards in an online service system that is subject to resource constraints. As an exemplary model for sequential decision-making, online allocation has drawn considerable attention in recent decades. Meanwhile, it is strongly connected to other online problems such as revenue management (Talluri et al., 2004), online linear programming (Agrawal et al., 2014), and ads bidding problems (Lee et al., 2013), to name but a few. Online allocation finds applications in diverse fields, e.g., computer science and operation research. Oftentimes, online allocation problems feature resource constraints that are either hard (Mehta et al., 2007) or soft (Mahdavi et al., 2012), with different constraint capacities. The goal of a decision maker is to maximize the total rewards (revenue, utility) function by a real-time decision policy that enforces each of the resource constraints. So far, existing literature on online allocation mostly focused on additively separable objectives, i.e., the objective function only involves the total rewards that can be simply described as the cumulative rewards by time (e.g., Mehta et al. (2007); Devanur and Hayes (2009); Balseiro and Gur (2019)). While a separable objective is favorable for tracking additive total rewards, it falls short of describing globally non-separable quantities such as total resource consumption or average actions. For instance, the average action (Agrawal and Devanur, 2014) in online advertising measures the amount of underdelivery of impressions.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found