DistributedInverseConstrainedReinforcement LearningforMulti-agentSystems

Neural Information Processing Systems 

We formally guarantee that the distributed learners asymptotically achieve consensus which belongs to the set of stationary points of the bi-level optimization problem.