A Hyperparameter Settings of RD

Feb-15-2026, 13:03:00 GMT–Neural Information Processing Systems

In this section, we describe the hyperparameter settings of RD. For SAC-N-Unc and TD3-N-Unc, M is set to 1/10 of the total training steps. To ensure a fair comparison, all algorithms employing RD are implemented using the CORL repository [54]. We derive these backbone algorithms by modifying the original SAC/TD3 algorithm to employ an ensemble of N critics and to incorporate an uncertainty regularization term in the policy update. Additionally, RD with a smaller Q ensemble can achieve similar or even better results than the backbone methods with a larger Q ensemble, indicating its potential to reduce computational cost.
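As a rough illustration of the backbone construction described above, the following sketch shows one common way to combine an ensemble of N critics with an uncertainty regularization term in the policy objective. The specific form used here (ensemble mean minus a weight times the ensemble standard deviation), the weight `beta`, the integer rounding of M, and all function names are assumptions made for illustration; they are not taken from this paper or from the CORL repository.

```python
from statistics import mean, pstdev


def m_from_total_steps(total_steps: int) -> int:
    """M as 1/10 of the total training steps (integer rounding is an assumption)."""
    return total_steps // 10


def uncertainty_penalized_value(q_values, beta):
    """Ensemble mean minus beta times the ensemble standard deviation.

    q_values: the N critic estimates for a single (state, action) pair.
    beta: hypothetical weight on the uncertainty term (not from the paper).
    """
    return mean(q_values) - beta * pstdev(q_values)


def actor_loss(batch_q_values, beta):
    """Negative penalized value averaged over a batch, to be minimized."""
    penalized = [uncertainty_penalized_value(q, beta) for q in batch_q_values]
    return -mean(penalized)


if __name__ == "__main__":
    # Two (state, action) pairs, each scored by N = 3 hypothetical critics.
    batch = [[1.0, 1.2, 0.8], [2.0, 2.0, 2.0]]
    print(m_from_total_steps(1_000_000))
    print(actor_loss(batch, beta=1.0))
```

Under this assumed form, actions on which the critics disagree receive a lower penalized value, so the policy update is steered toward actions whose value estimates are consistent across the ensemble.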

