Faster Deep Reinforcement Learning with Slower Online Network
Neural Information Processing Systems
Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping.
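The two-network arrangement described above can be illustrated with a minimal tabular sketch. It assumes the target network is refreshed by a periodic hard copy of the online values, which is one common way to implement the delay; the abstract does not specify the mechanism. The names `td_update` and `sync_target` are hypothetical and chosen only for illustration.

```python
import copy


def td_update(online_q, target_q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One Q-learning step on the online table.

    The bootstrap term is read from the (delayed) target table, so the
    value being regressed toward does not shift with every online update.
    """
    bootstrap = 0.0 if done else max(target_q[s_next])
    td_target = r + gamma * bootstrap
    online_q[s][a] += alpha * (td_target - online_q[s][a])


def sync_target(online_q):
    """Hard update (an assumption here): the target becomes a frozen copy."""
    return copy.deepcopy(online_q)


# Two states, two actions, all values start at zero.
online_q = [[0.0, 0.0], [0.0, 0.0]]
target_q = sync_target(online_q)

# Terminal transition from state 1: only the online table changes.
td_update(online_q, target_q, s=1, a=0, r=1.0, s_next=0, done=True)

# Bootstrapping from state 1 still sees the stale target value (0.0).
td_update(online_q, target_q, s=0, a=1, r=0.0, s_next=1, done=False)
stale_value = online_q[0][1]

# After syncing, the same transition bootstraps from the updated value.
target_q = sync_target(online_q)
td_update(online_q, target_q, s=0, a=1, r=0.0, s_next=1, done=False)
fresh_value = online_q[0][1]
```

In this sketch the online table moves on every step, while the bootstrap target only changes when `sync_target` is called, which is the delay the abstract refers to.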
Nov-15-2025, 06:23:18 GMT
- Country:
- Asia
- Middle East > Jordan (0.04)
- Russia (0.04)
- South Korea > Seoul
- Seoul (0.04)
- Europe
- Asia
- Industry:
- Leisure & Entertainment > Games (1.00)
- Technology: