Appendix A Proof of Theorem

Feb-18-2024, 04:14:27 GMT–Neural Information Processing Systems

Both methods are able to establish sublinear regret and sublinear constraint violation under UCB. Moreover, with the aid of slackness (i.e., ") in the dual update, both methods can establish bounded or even zero constraint violation. For general exploration strategies beyond UCB, we tend to believe that convex-opt method has advantages over the current analysis in the Lyapunov-drift method, since the latter explicitly relies on optimism in both regret and constraint violation analysis. On the other hand, Lyapunov-drift method can easily handle an anytime slackness, say, "

apple, constraint violation, probability, (14 more...)

Neural Information Processing Systems

Feb-18-2024, 04:14:27 GMT

Conferences PDF

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)