Appendix for Softmax Deep Double Deterministic Policy Gradients

Ling Pan

Feb-9-2026, 06:33:38 GMT–Neural Information Processing Systems

In this section, we demonstrate the smoothing effect of SD3 on the optimization landscape, where the experimental setup is the same as in Section 4.1 of the main text for the comparative study of SD2 and [...] Experimental details can be found in Section B.2. The performance comparison of SD3 and TD3 is shown in Figure 1(a), where SD3 significantly outperforms TD3. So far, we have demonstrated the smoothing effect of SD3 over TD3. Hyperparameters of DDPG and SD2 are summarized in Table 1. Assume that the actor is a local maximizer with respect to the critic.

artificial intelligence, machine learning, sd3
