Cold-Start Reinforcement Learning with Softmax Policy Gradient
Neural Information Processing Systems
The exposure-bias problem has recently received attention in neural-network settings with the "data as demonstrator" [
Nov-21-2025, 13:58:53 GMT
- Country:
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Genre:
- Research Report (0.46)
- Technology: