Goto

Collaborating Authors

 lce-a





High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

Neural Information Processing Systems

Here we consider contextual policies that are a map from a discrete context to a set of continuous parameters . For example, video streaming and real-time conferencing systems use adaptive bitrate (ABR) algorithms to balance between video quality and uninterrupted playback.


highlight the technical novelty and contribution of our work. 3 Reviewer

Neural Information Processing Systems

We thank the reviewers for their critical assessment of our work, and for the overall positive reception. With stronger correlation there is a larger norm and more regularization. We will clarify this in the paper. There are 6 controller parameters and 5 connection qualities (from excellent to poor). The embedding sizes are set to be 1 in the simulation studies.