A Discussion on Hyper parameter Tuning

Aug-13-2025, 16:23:36 GMT–Neural Information Processing Systems

Contextual bandits are a class of online learning problems that can be viewed as a simple reinforcement learning problem without state transitions. For a complete understanding of contextual bandit problems, we refer the reader to Chapter 4 of [Bubeck et al., 2012]. Here we include the main idea for completeness. In contextual bandit problems, the agent needs to find the best action given some observed context (a.k.a. the optimal policy in reinforcement learning). Formally, we define S as the context set and K as the number of actions.
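To make the setup concrete, the following is a minimal sketch of the interaction loop, not the method discussed in the paper. It assumes a finite context set S indexed 0..|S|-1, K actions, Bernoulli rewards, and an epsilon-greedy learner; the function name `run_contextual_bandit`, the `mean_reward` table, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def run_contextual_bandit(num_contexts, num_actions, mean_reward,
                          rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a tabular contextual bandit (illustrative only).

    mean_reward[s][a] is the assumed probability of reward 1 for action a
    in context s. There are no transitions: each context is drawn
    independently of the previous action.
    """
    rng = random.Random(seed)
    counts = [[0] * num_actions for _ in range(num_contexts)]
    values = [[0.0] * num_actions for _ in range(num_contexts)]

    for _ in range(rounds):
        # The environment reveals a context (drawn uniformly here, by assumption).
        s = rng.randrange(num_contexts)

        # The agent explores with probability epsilon, otherwise acts greedily.
        if rng.random() < epsilon:
            a = rng.randrange(num_actions)
        else:
            a = max(range(num_actions), key=lambda k: values[s][k])

        # Only the reward of the chosen action is observed (bandit feedback).
        r = 1.0 if rng.random() < mean_reward[s][a] else 0.0

        # Incremental update of the running mean for the pair (s, a).
        counts[s][a] += 1
        values[s][a] += (r - values[s][a]) / counts[s][a]

    # The learned policy maps each context to its highest-valued action.
    policy = [max(range(num_actions), key=lambda k: values[s][k])
              for s in range(num_contexts)]
    return policy, values


if __name__ == "__main__":
    # Two contexts, two actions; the best action differs per context.
    policy, _ = run_contextual_bandit(2, 2, [[0.9, 0.1], [0.1, 0.9]])
    print(policy)
```

The returned `policy` list plays the role of the optimal policy mentioned above: one action per context. Exploration rate, round count, and the uniform context distribution are exactly the kind of settings a hyperparameter discussion would vary.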



