Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn
–Neural Information Processing Systems
Network outputs can change indirectly to unexpected values after any random batch update for input data not included in the batch, called churn in this paper.
Neural Information Processing Systems
Oct-9-2025, 20:05:38 GMT