Doubly Mild Generalization for Offline Reinforcement Learning

Yixiu Mao 1, Qi Wang 1, Yun Qu

Mar-21-2025, 03:13:09 GMT–Neural Information Processing Systems

Offline reinforcement learning (RL) suffers from extrapolation error and value overestimation. From a generalization perspective, this issue can be attributed to the over-generalization of value functions or policies towards out-of-distribution (OOD) actions.

artificial intelligence, machine learning, reinforcement learning

Genre:
- Research Report > Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)