Safe and Efficient Online Convex Optimization with Linear Budget Constraints and Partial Feedback

Liu, Shanqi, Liu, Xin

arXiv.org Artificial Intelligence 

However, such "anytime safe projection" methods Online Convex Optimization (OCO) provides a versatile may encounter three potential challenges when dealing with framework for studying online decision-making in dynamic budget constraints: 1) they often require a substantial initial and uncertain environments [1]-[3]. Within this framework, period to explore and learn the consumption matrix; 2) determining a learner continuously adapts its decisions to minimize a the "correct" safe constraint set based on an estimated loss function or maximize a utility function while interacting consumption matrix is difficult and they are very likely to with the environment in real-time. OCO has wide-ranging be overly conservative ensures safety but degrades performance; applications, including resource allocation in network systems 3) the projection-based methods (e.g., projected online [4]-[8], load balancing in server systems [9]-[11], online gradient descent) may require heavy computation because it advertising [12], [13], and personalized healthcare [14], [15]. is equivalent to solving a constrained quadratic optimization In OCO framework, the learner chooses a decision x