Review for NeurIPS paper: Agnostic Q-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity
–Neural Information Processing Systems
Weaknesses: First, the proof, as the authors themselves note, depends on the optimality-gap assumption. The relationship between the approximation error and this gap is crucial: a larger approximation error requires a larger gap to retain the favorable guarantees, and it is not entirely clear whether the resulting bounds are meaningful in practice. Second, the algorithm for the general case requires an oracle that returns the most uncertain action at a given state for the approximation family F. While the authors argue that a similar oracle is used in previous work, it is not clear that this oracle is more realistic than the one in work they dismiss in the related-work section (the "Knows-What-It-Knows" oracle of Li et al., 2011). Third, the proof applies only to deterministic systems, which significantly restricts its applicability.
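To make the first weakness concrete, the gap-error relationship can be sketched as follows (this is the reviewer's paraphrase under assumed notation, not the paper's exact statement: here δ denotes the best-in-class approximation error of F and ρ the minimal optimality gap; the precise constants and dimension-dependent factors are those in the paper):

```latex
\delta \;=\; \min_{f \in \mathcal{F}} \|f - Q^{*}\|_{\infty},
\qquad
\rho \;=\; \min_{s,\; a \neq \pi^{*}(s)} \bigl( V^{*}(s) - Q^{*}(s,a) \bigr),
```

with the favorable guarantees requiring, up to dimension-dependent factors, a condition of the form δ ≲ ρ. In other words, as the function class fits Q* less well, the gap between optimal and suboptimal actions must grow proportionally, which is a strong requirement for practical problems.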
Feb-8-2025, 16:23:24 GMT