AITopics | Reinforcement Learning

49f85a9ed090b20c8bed85a5923c669f-Paper.pdf

Neural Information Processing Systems | Oct-2-2025, 20:37:26 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: North America (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Variance Reduced Policy Evaluation with Smooth Function Approximation

Hoi-To Wai, Mingyi Hong, Zhuoran Yang, Zhaoran Wang, Kexin Tang

Neural Information Processing Systems | Oct-2-2025, 20:32:37 GMT

Policy evaluation with smooth and nonlinear function approximation has shown great potential for reinforcement learning. Compared to linear function approximation, it allows for using a richer class of approximation functions such as neural networks. Traditional algorithms are based on two-timescale stochastic approximation, whose convergence rate is often slow.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.84)

The authors would like to thank all the three reviewers for their useful feedback and the area chair for handling this

Neural Information Processing Systems | Oct-2-2025, 20:32:22 GMT

To address the reviewers' comments, upon acceptance of this paper, we will (i) include numerical experiment ... Some common concerns are as follows. ... Details of this experiment will be found in final version. Reviewer 1: We thank the reviewer for providing constructive and supportive comments. ... They will be corrected in the final version. ... Details will be provided in the final version.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.76)

When to Trust Your Model: Model-Based Policy Optimization

Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine

Neural Information Processing Systems | Oct-2-2025, 20:18:45 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

The LoCA Regret: A Consistent Metric to Evaluate Model-Based Behavior in Reinforcement Learning -- Supplementary Material -- A Tabular Experiments

Neural Information Processing Systems | Oct-2-2025, 20:18:27 GMT

Here, we discuss some additional settings for the tabular experiments. ... The reason for this is that Sarsa(0.95), in contrast to MB-VI and MB-SU, is a multi-step ... Therefore, there is stochasticity in the update target even in deterministic environments due to exploration of the behavior policy. All methods used optimistic initialization. The pseudocode of the tabular, on-policy method used in Section 5.1 is shown in Algorithm 1. These estimates are updated at the end of the episode, using the data gathered during the episode.

experiment, machine learning, reinforcement learning, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)
