Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL

Neural Information Processing Systems 

Most existing results focus only on the online setting, in which agents can interact with the environment during training.