Proof of Lemma 4


Learning Social Welfare Functions

Neural Information Processing Systems

Consider a standard decision-making setting with a set of possible actions (decisions or policies) and a set of individuals who assign utilities to the actions.
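This setting can be made concrete with a small sketch. The abstract only fixes actions, individuals, and utilities; the two welfare functions below (utilitarian and Nash) are standard textbook examples chosen for illustration, not necessarily the ones studied in the paper, and the policy names and utility values are invented.

```python
import math

def utilitarian_welfare(utils):
    # Utilitarian social welfare: sum of individual utilities.
    return sum(utils)

def nash_welfare(utils):
    # Nash social welfare: product of utilities (assumes positive utilities).
    return math.prod(utils)

def best_action(utilities, welfare):
    # utilities[a][i] is the utility individual i assigns to action a;
    # pick the action maximizing the given social welfare function.
    return max(utilities, key=lambda a: welfare(utilities[a]))

# Hypothetical example: three individuals, two candidate policies.
utilities = {
    "policy_A": [3.0, 3.0, 3.0],  # equal utilities
    "policy_B": [8.0, 1.0, 1.0],  # higher total, very unequal
}
```

Note that the two welfare functions can disagree: utilitarian welfare prefers `policy_B` (total 10 vs. 9), while Nash welfare prefers the more equitable `policy_A` (product 27 vs. 8).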


A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning

Neural Information Processing Systems

In contrast, the mathematics needed to analyze such schemes is the focus of Stochastic Approximation (SA) theory [2, 4]. More generally, SA refers to an iterative scheme for finding zeroes or optimal points of a function for which only noisy evaluations are possible.
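The SA idea described above can be sketched with the classical Robbins-Monro iteration: to find a root of f given only noisy evaluations, step against the noisy value with step sizes a_n that sum to infinity while their squares sum finitely. The target function and noise level below are invented for illustration.

```python
import random

def robbins_monro(noisy_f, x0, steps=5000, seed=0):
    """Find a root of f using only noisy evaluations noisy_f(x, rng) ~ f(x) + noise."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, steps + 1):
        a_n = 1.0 / n  # step sizes: sum a_n = inf, sum a_n**2 < inf
        x = x - a_n * noisy_f(x, rng)
    return x

# Toy target: f(x) = x - 2, root at x = 2, observed with Gaussian noise.
def noisy_f(x, rng):
    return (x - 2.0) + rng.gauss(0.0, 0.1)
```

With these step sizes the iterate effectively averages out the noise, so `robbins_monro(noisy_f, 0.0)` lands close to the root 2 despite never seeing an exact evaluation of f.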



637de5e2a7a77f741b0b84bd61c83125-Supplemental-Conference.pdf

Neural Information Processing Systems

A.1 Proof of Theorem 3.1
The proof can be found in (Hardt et al., 2016). We provide the proof in the Appendix for reference. First, it is easy to see that h1(w) and h2(w) are both η-approximate β-gradient Lipschitz, which satisfies the inequality in Eq. (A.1).

A.5 Proof of Theorem 5.1
The proof follows the standard techniques for uniform stability. We need to replace the non-expansive property used in the standard analysis by the approximately non-expansive property.
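The two expansiveness notions contrasted above can be sketched as follows. This is a hedged reconstruction: in the standard stability analysis (as in Hardt et al., 2016), the gradient update of a convex β-smooth function is non-expansive, and the approximate variant plausibly picks up a slack tied to the η-approximate β-gradient Lipschitz condition; the exact constants are those of Eq. (A.1) in the source, which is not reproduced here.

```latex
% Gradient update with step size \alpha:
G(w) = w - \alpha \nabla h(w).
% Standard non-expansiveness (h convex and \beta-smooth, \alpha \le 2/\beta):
\|G(w) - G(w')\| \le \|w - w'\|.
% Approximately non-expansive variant (illustrative form; the precise slack
% follows from the \eta-approximate \beta-gradient Lipschitz condition):
\|G(w) - G(w')\| \le \|w - w'\| + O(\alpha \eta).
```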