 Agents


Supplementary Materials for "Multi-Agent Meta-Reinforcement Learning" A Technical Lemmas

Neural Information Processing Systems

From the three-points identity of the Bregman divergence (Lemma 3.1 of [9]), $\mathrm{KL}(x \,\|\, y) - \mathrm{KL}(\tilde{x} \,\|\, y) = \mathrm{KL}(x \,\|\, \tilde{x}) + \langle \ln \tilde{x} - \ln y,\; x - \tilde{x} \rangle$ (12). The first term in (12) can be bounded by $\mathrm{KL}(x \,\|\, \tilde{x}) =$ [...] By Hölder's inequality, the second term in (12) is bounded as $\langle \ln \tilde{x} - \ln y,\; x - \tilde{x} \rangle \le \|\ln \tilde{x} - \ln y\|$ [...] Lemma 5. Consider a block diagonal matrix [...] We prove the lemma via induction on $N$. [...] This completes the induction proof. Lemma 6. [...] We introduce one more piece of notation before presenting the proof. [...] This leads us to the initialization-dependent convergence rate of Algorithm 1, which we restate and prove as follows. [...] In addition, if we initialize the players' policies to be uniform policies, i.e., [...] The rest of the proof follows by putting all the aforementioned results together.
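The three-points identity quoted in this excerpt can be checked numerically. The sketch below is not taken from the paper: it writes x̃ (`x_tilde`) for the second point, draws arbitrary strictly positive points on the simplex, and confirms that KL(x‖y) − KL(x̃‖y) equals KL(x‖x̃) + ⟨ln x̃ − ln y, x − x̃⟩ up to rounding. The Hölder step is illustrated with the ∞-norm/1-norm pairing, which is an assumption, since the excerpt does not preserve which norms the authors use. All function names are hypothetical.

```python
import math
import random

def kl(p, q):
    """KL(p || q) for strictly positive probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def random_simplex_point(n, rng):
    """Draw a strictly positive point on the probability simplex."""
    w = [rng.random() + 0.05 for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]

def three_point_gap(x, x_tilde, y):
    """Left side minus right side of the identity; should be ~0."""
    lhs = kl(x, y) - kl(x_tilde, y)
    inner = sum((math.log(a) - math.log(b)) * (c - d)
                for a, b, c, d in zip(x_tilde, y, x, x_tilde))
    return lhs - (kl(x, x_tilde) + inner)

def holder_slack(x, x_tilde, y):
    """Hoelder bound minus the inner product; should be >= 0.

    Assumed norm pairing: inf-norm on the log difference,
    1-norm on the policy difference.
    """
    diff_log = [math.log(a) - math.log(b) for a, b in zip(x_tilde, y)]
    diff_x = [c - d for c, d in zip(x, x_tilde)]
    inner = sum(u * v for u, v in zip(diff_log, diff_x))
    bound = max(abs(u) for u in diff_log) * sum(abs(v) for v in diff_x)
    return bound - inner

rng = random.Random(0)
x, x_tilde, y = (random_simplex_point(4, rng) for _ in range(3))
gap = three_point_gap(x, x_tilde, y)
slack = holder_slack(x, x_tilde, y)
print(abs(gap) < 1e-12, slack >= 0)
```

The identity is purely algebraic, so the gap should vanish for any choice of positive points; the Hölder slack is nonnegative but generally not tight.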




Group Fairness in Peer Review

Neural Information Processing Systems

Large conferences such as NeurIPS and AAAI serve as crossroads of various AI fields, since they attract submissions from a vast number of communities. However, in some cases, this has resulted in a poor reviewing experience for some communities, whose submissions get assigned to less qualified reviewers outside of their communities. An often-advocated solution is to break up any such large conference into smaller conferences, but this can lead to isolation of communities and harm interdisciplinary research.



EDGI: Equivariant Diffusion for Planning with Embodied Agents Supplementary Material A Architecture details

Neural Information Processing Systems

We illustrate the architecture in Figure 1 in the main paper. We use a kernel size of 5. This is essentially an equivariant version of LayerNorm. In the geometric layers, the input state is split into scalar and vector components. The vector components are linearly transformed to reduce the number of channels to 16.
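To make the scalar/vector split and the channel reduction concrete, here is a minimal sketch of one way such a geometric layer could look. It is not the authors' implementation. The flat state layout, the 32 input vector channels, the 4 scalar channels, the bias-free channel mixing, and the RMS-length normalization standing in for an "equivariant LayerNorm" are all assumptions made for illustration; only the reduction to 16 vector channels comes from the text. The final lines check that rotating the input and then applying the layer gives the same result as applying the layer and then rotating.

```python
import math
import random

def split_state(state, n_scalars):
    """Split a flat state into scalars and 3D vectors (hypothetical layout)."""
    scalars = state[:n_scalars]
    rest = state[n_scalars:]
    vectors = [rest[i:i + 3] for i in range(0, len(rest), 3)]
    return scalars, vectors

def mix_channels(vectors, weights):
    """Bias-free channel mixing: out[j] = sum_i weights[j][i] * vectors[i]."""
    return [[sum(w * v[k] for w, v in zip(row, vectors)) for k in range(3)]
            for row in weights]

def equivariant_norm(vectors, eps=1e-8):
    """Rescale every channel by the RMS vector length across channels.

    The scale depends only on rotation-invariant norms, so the
    operation commutes with rotations.
    """
    mean_sq = sum(c * c for v in vectors for c in v) / len(vectors)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [[scale * c for c in v] for v in vectors]

def rotate_z(vectors, angle):
    """Rotate every 3D vector about the z-axis."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [[cos_a * x - sin_a * y, sin_a * x + cos_a * y, z]
            for x, y, z in vectors]

def geometric_layer(vectors, weights):
    """Reduce the vector channels, then normalize equivariantly."""
    return equivariant_norm(mix_channels(vectors, weights))

rng = random.Random(0)
n_scalars, in_channels, out_channels = 4, 32, 16  # 4 and 32 are assumed
state = [rng.gauss(0, 1) for _ in range(n_scalars + 3 * in_channels)]
scalars, vectors = split_state(state, n_scalars)
weights = [[rng.gauss(0, 1) for _ in range(in_channels)]
           for _ in range(out_channels)]

# Equivariance check: layer-then-rotate versus rotate-then-layer.
out_then_rotate = rotate_z(geometric_layer(vectors, weights), 0.7)
rotate_then_out = geometric_layer(rotate_z(vectors, 0.7), weights)
max_error = max(abs(a - b)
                for u, v in zip(out_then_rotate, rotate_then_out)
                for a, b in zip(u, v))
print(len(out_then_rotate), max_error < 1e-9)
```

The mixing weights act only on the channel index and never on the spatial components, which is what keeps the map equivariant; adding a per-channel bias vector would break that property.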