Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn

Oct-9-2025, 20:05:38 GMT–Neural Information Processing Systems

After any random batch update, network outputs can change indirectly, and to unexpected values, for input data not included in the batch; this paper calls the phenomenon churn.
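As a rough illustration of the idea, the sketch below measures how much a model's outputs shift on inputs that were left out of a single mini-batch update. It is a minimal example under stated assumptions, not the paper's method: it uses a linear approximator instead of a deep network, and the names `predict`, `sgd_step`, and `churn` are hypothetical helpers introduced only for this example.

```python
import random

def predict(weights, x):
    """Linear function approximator: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))

def sgd_step(weights, batch, lr=0.1):
    """One mini-batch gradient step on mean squared error."""
    grad = [0.0] * len(weights)
    for x, target in batch:
        err = predict(weights, x) - target
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(batch)
    return [w - lr * g for w, g in zip(weights, grad)]

def churn(weights_before, weights_after, held_out_inputs):
    """Mean absolute output change on inputs that were NOT in the batch."""
    diffs = [abs(predict(weights_after, x) - predict(weights_before, x))
             for x in held_out_inputs]
    return sum(diffs) / len(diffs)

# Hypothetical toy data: random features and targets (an assumption for the demo).
rng = random.Random(0)
dim = 4
weights = [rng.uniform(-1, 1) for _ in range(dim)]
batch = [([rng.uniform(-1, 1) for _ in range(dim)], rng.uniform(-1, 1))
         for _ in range(8)]
held_out = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(32)]

new_weights = sgd_step(weights, batch)
value_churn = churn(weights, new_weights, held_out)
print(f"mean output change on held-out inputs: {value_churn:.4f}")
```

Because the parameters are shared across all inputs, the update fitted to the batch also moves the predictions for the held-out inputs, even though none of them contributed to the gradient.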

deviation, reinforcement learning, value and policy

Country:
- North America > Canada
  - Quebec > Montreal (0.04)
- Europe > Portugal
  - Braga > Braga (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report
  - New Finding (0.93)
  - Experimental Study (0.93)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
