Appendix for "Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games"

Apr-24-2026, 13:10:26 GMT–Neural Information Processing Systems

A.1 Proof of Theorem 1

To prove Theorem 1, we need the following lemma. [...] See Proposition 7.1 in [3]. Now we can prove Theorem 1.

Proof. For games with only one step (normal-form games, functional-form games), there is only one fixed state. Therefore, the state-action distribution is equivalent to the action distribution.

A.2 Proof of Theorem 2

Let us restate Theorem 2.

Theorem 2. For a given empirical payoff matrix $A \in \mathbb{R}^{M \times N}$ and the reward vector $a_{M+1}$ for policy $M+1$, [...] $\|(I - A^\top (A^\top)^\dagger)\, a_{M+1}\|^2$, (18)

where $(A^\top)^\dagger$ is the Moore-Penrose pseudoinverse of $A^\top$, and $\sigma_{\min}(A)$ is the minimum singular value of $A$.

Proof. [...] The last equation comes from the analytic calculation of $\min_{\mathbf{1}^\top \beta = 1} \|\beta - (A^\top)^\dagger a_{M+1}\|^2$ using a Lagrangian.
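The Lagrangian step at the end of the proof of Theorem 2 can be checked numerically. The sketch below is illustrative and not taken from the paper: it uses an arbitrary vector c as a hypothetical stand-in for the pseudoinverse term applied to a_{M+1}, and the helper names are made up for this example. It relies only on the standard fact that the closest point to c on the hyperplane where the entries of beta sum to one is c shifted uniformly by (1 - sum(c)) / M, which gives a minimum squared distance of (1 - sum(c))^2 / M.

```python
import random


def project_onto_sum_one(c):
    """Closest point to c on the hyperplane {beta : sum(beta) == 1}.

    Setting the gradient of the Lagrangian
        ||beta - c||^2 + lam * (sum(beta) - 1)
    to zero gives beta = c - (lam / 2) * ones, and the constraint
    then fixes the uniform shift to (1 - sum(c)) / M.
    """
    m = len(c)
    shift = (1.0 - sum(c)) / m
    return [ci + shift for ci in c]


def min_sq_distance(c):
    """Closed-form minimum of ||beta - c||^2 subject to sum(beta) == 1."""
    return (1.0 - sum(c)) ** 2 / len(c)


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


# Hypothetical values; in the proof this vector would come from the
# pseudoinverse of the transposed payoff matrix applied to a_{M+1}.
c = [0.2, -0.5, 1.1, 0.4]
beta_star = project_onto_sum_one(c)
closed_form = min_sq_distance(c)

# Sanity check: no other feasible beta should be closer to c.
random.seed(0)
feasible_distances = []
for _ in range(1000):
    raw = [random.uniform(-2.0, 2.0) for _ in c]
    beta = project_onto_sum_one(raw)  # any point on the hyperplane
    feasible_distances.append(sq_dist(beta, c))

print(sum(beta_star), sq_dist(beta_star, c), closed_form)
print(min(feasible_distances) >= closed_form - 1e-12)
```

The random search is only a sanity check, not a proof; the closed form itself follows from the first-order conditions described in the docstring.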
