A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents
–Neural Information Processing Systems
In multiagent domains, coping with non-stationary agents that change behaviors from time to time is a challenging problem, where an agent is usually required to be able to quickly detect the other agent's policy during online interaction, and then adapt its own policy accordingly.
Neural Information Processing Systems
Mar-16-2026, 23:01:05 GMT
- Technology: