Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback

Neural Information Processing Systems 

We propose computationally efficient algorithms for online linear optimization with bandit feedback, in which a player chooses an action vector from a given (possibly infinite) set A ⊆ R^d, and then suffers a loss that can be expressed as a linear function of the action vector.
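The interaction described above can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes a small finite action set, a fixed loss sequence, and a uniformly random player, all chosen purely for illustration. The names `run_protocol`, `best_fixed_loss`, and `uniform_player` are hypothetical. The sketch only shows the feedback model, in which the player observes the scalar loss of its chosen action and never the loss vector itself.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def run_protocol(actions, loss_vectors, choose, seed=0):
    """Play one action per round; the player sees only its own scalar loss."""
    rng = random.Random(seed)
    history = []  # (action index, observed loss) pairs visible to the player
    total = 0.0
    for loss_vec in loss_vectors:
        i = choose(history, len(actions), rng)
        loss = dot(loss_vec, actions[i])  # linear loss <l_t, a_t>
        history.append((i, loss))         # bandit feedback: scalar only
        total += loss
    return total, history


def best_fixed_loss(actions, loss_vectors):
    """Cumulative loss of the best single action in hindsight."""
    return min(sum(dot(l, a) for l in loss_vectors) for a in actions)


def uniform_player(history, n_actions, rng):
    """Baseline that ignores feedback (assumption: for illustration only)."""
    return rng.randrange(n_actions)


# Illustrative data (assumed, not from the paper): three actions in R^2.
actions = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
loss_vectors = [(0.2, 0.8)] * 50 + [(0.6, 0.4)] * 50

total, history = run_protocol(actions, loss_vectors, uniform_player)
regret = total - best_fixed_loss(actions, loss_vectors)
print(f"cumulative loss: {total:.2f}, regret vs best fixed action: {regret:.2f}")
```

Regret here is measured against the best fixed action in hindsight, which is the usual benchmark in this setting. A learning algorithm would replace `uniform_player` with a rule that uses `history`.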
