Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback
Neural Information Processing Systems
We propose computationally efficient algorithms for online linear optimization with bandit feedback, in which a player chooses an action vector from a given (possibly infinite) set $A \subseteq \mathbb{R}^d$, and then suffers a loss that can be expressed as a linear function in action vectors.
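The interaction described above can be sketched as a small simulation. This is only an illustration of the feedback protocol, not the algorithm proposed in the paper: the player here picks actions uniformly at random as a placeholder strategy, the action set is a small finite one, and the names `play_bandit_linear`, `best_fixed_loss`, `actions`, and `loss_vectors` are hypothetical. The comparison against the best fixed action is the standard benchmark for this setting and is an assumption not stated in the text above.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def play_bandit_linear(actions, loss_vectors, rng):
    """Simulate the protocol with a uniformly random player (placeholder strategy)."""
    total_loss = 0.0
    feedback = []
    for ell in loss_vectors:
        a = rng.choice(actions)   # player chooses an action vector from the set A
        loss = dot(ell, a)        # loss is linear in the chosen action
        feedback.append(loss)     # bandit feedback: only this scalar is observed,
        total_loss += loss        # the loss vector ell itself stays hidden
    return total_loss, feedback


def best_fixed_loss(actions, loss_vectors):
    """Cumulative loss of the best single action in hindsight (assumed benchmark)."""
    return min(sum(dot(ell, a) for ell in loss_vectors) for a in actions)


# Hypothetical toy instance with d = 2 and two actions.
actions = [(1.0, 0.0), (0.0, 1.0)]
loss_vectors = [(0.2, 0.8), (0.5, 0.1), (0.3, 0.9)]

total, feedback = play_bandit_linear(actions, loss_vectors, random.Random(0))
benchmark = best_fixed_loss(actions, loss_vectors)
print("player loss:", total, "best fixed action loss:", benchmark)
```

The point of the sketch is the information structure: the player sees one number per round rather than the full loss vector, which is what separates bandit feedback from full-information online linear optimization.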