I2Q: A Fully Decentralized Q-Learning Algorithm
Neural Information Processing Systems
The modeling of the ideal transition function in I2Q is fully decentralized and independent of the learned policies of other agents, which frees I2Q from non-stationarity and allows it to learn the optimal policy.
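To illustrate why a value defined on ideal transitions does not depend on what the other agents are currently doing, here is a minimal Python sketch. It is not the I2Q algorithm from the paper: it is a stateless, deterministic two-agent cooperative game, where the "ideal" outcome for an agent's own action is assumed to reduce to the best reward ever observed for that action. The payoff table `REWARD`, the function `run`, and all parameters are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical shared-reward matrix (an assumption, not from the paper):
# REWARD[a0][a1] is the reward both agents receive for joint action (a0, a1).
REWARD = [
    [10, -20, 0],
    [-20, 5, 0],
    [0, 0, 2],
]
N_ACTIONS = 3


def run(episodes=2000, seed=0):
    """Compare two decentralized estimates under uniformly random exploration.

    q_avg:   running average of rewards per own action; it depends on how
             the other agent happens to behave (source of non-stationarity).
    q_ideal: best reward observed per own action; in this deterministic toy
             game it stands in for the value under an ideal transition,
             i.e. assuming the other agent acts optimally.
    """
    rng = random.Random(seed)
    q_avg = [[0.0] * N_ACTIONS for _ in range(2)]
    q_ideal = [[float("-inf")] * N_ACTIONS for _ in range(2)]
    counts = [[0] * N_ACTIONS for _ in range(2)]

    for _ in range(episodes):
        # Each agent picks its own action; neither sees the other's choice.
        actions = [rng.randrange(N_ACTIONS), rng.randrange(N_ACTIONS)]
        reward = REWARD[actions[0]][actions[1]]
        for i in range(2):
            a = actions[i]
            counts[i][a] += 1
            # Incremental mean: sensitive to the other agent's policy.
            q_avg[i][a] += (reward - q_avg[i][a]) / counts[i][a]
            # Max over observed outcomes: independent of that policy.
            q_ideal[i][a] = max(q_ideal[i][a], reward)
    return q_avg, q_ideal


def greedy(q_values):
    """Index of the highest-valued action."""
    return q_values.index(max(q_values))


q_avg, q_ideal = run()
print("greedy on averaged estimate:", [greedy(q) for q in q_avg])
print("greedy on ideal estimate:   ", [greedy(q) for q in q_ideal])
```

In this toy setting the averaged estimate should be dragged toward the safe but suboptimal action 2 by the other agent's random behavior, while the max-based estimate should select action 0 for both agents, which is the joint optimum. The max-based update is only valid here because rewards are deterministic; it is a stand-in for the idea, not a description of how I2Q models ideal transitions.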
Feb-10-2026, 07:04:57 GMT