Lipschitz continuity




A. Preliminaries on the Sinkhorn Potentials; Lemma A.1 (Lemma A.2 elaborated)

Neural Information Processing Systems

One can observe that the Sinkhorn potentials are not unique. In this section, under Assumptions A.2 and A.3, we provide several lemmas establishing the Lipschitz continuity (with respect to the underlying measures) of the Sinkhorn potentials. These lemmas will be used in the convergence analysis and the mean-field analysis for SD. Please see the proof in Appendix C.3. It would be an interesting direction for future work to remove this factor. We note that Lemma B.1 is strictly stronger than preexisting results: Proposition 13 of Feydy et al. [2019] only shows that the dual potentials are continuous (not Lipschitz continuous), which is strictly weaker than (i) of Lemma B.1.
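To give a concrete feel for the objects discussed here, the sketch below runs log-domain Sinkhorn iterations on a toy discrete problem. The function names and setup are our own illustration, not the paper's construction; in particular, since the potentials are only unique up to an additive constant (as noted above), any comparison should be done after centering them.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log-sum-exp along the given axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def sinkhorn_potentials(mu, nu, C, eps=0.1, iters=500):
    """Log-domain Sinkhorn iterations for discrete measures mu, nu and cost C.
    Returns dual (Sinkhorn) potentials (f, g); they are only unique up to an
    additive constant (f + c, g - c)."""
    f, g = np.zeros(len(mu)), np.zeros(len(nu))
    logmu, lognu = np.log(mu), np.log(nu)
    for _ in range(iters):
        # f_i = -eps * log sum_j nu_j exp((g_j - C_ij) / eps), and symmetrically for g
        f = -eps * logsumexp((g[None, :] - C) / eps + lognu[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + logmu[:, None], axis=0)
    return f, g
```

Perturbing `mu` slightly and comparing the centered potentials gives a numerical sense of the Lipschitz property in the underlying measure; the constants in the lemmas above would bound the resulting ratio.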


Achieving Domain-Independent Certified Robustness via Knowledge Continuity

Neural Information Processing Systems

We present knowledge continuity, a novel definition inspired by Lipschitz continuity which aims to certify the robustness of neural networks across input domains (such as continuous and discrete domains in vision and language, respectively). Most existing approaches that seek to certify robustness, especially Lipschitz continuity, lie within the continuous domain with norm- and distribution-dependent guarantees. In contrast, our proposed definition yields certification guarantees that depend only on the loss function and the intermediate learned metric spaces of the neural network. These bounds are independent of domain modality, norms, and distribution. We further demonstrate that the expressiveness of a model class is not at odds with its knowledge continuity. This implies that achieving robustness by maximizing knowledge continuity should not theoretically hinder inferential performance. Finally, to complement our theoretical results, we present several applications of knowledge continuity, such as regularization and a certification algorithm, and show that knowledge continuity can be used to localize vulnerable components of a neural network.
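As a rough illustration of the regularization idea, the sketch below estimates a Lipschitz-style ratio of loss differences to distances in a learned representation space over sample pairs. This is our own heuristic stand-in, not the paper's definition of knowledge continuity; the function name and formula are assumptions for illustration only.

```python
import numpy as np

def knowledge_continuity_penalty(losses, reps, tol=1e-8):
    """Illustrative penalty: max over sample pairs of
    |loss_i - loss_j| / ||h_i - h_j||, where h are intermediate
    representations. Penalizing this encourages the loss to vary
    smoothly across the learned metric space."""
    dl = np.abs(losses[:, None] - losses[None, :])              # pairwise loss gaps
    dh = np.linalg.norm(reps[:, None, :] - reps[None, :, :], axis=-1)
    ratios = dl / (dh + tol)                                    # Lipschitz-style ratios
    np.fill_diagonal(ratios, 0.0)                               # ignore self-pairs
    return ratios.max()
```

In training, such a quantity could be added to the objective so that gradient descent trades off fit against smoothness of the loss over the representation space.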


Compressive Visual Representations

Neural Information Processing Systems

Learning effective visual representations that generalize well without human supervision is a fundamental problem for applying machine learning to a wide variety of tasks. Recently, two families of self-supervised methods, contrastive learning and latent bootstrapping, exemplified by SimCLR and BYOL respectively, have made significant progress.
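For readers unfamiliar with the contrastive family, a minimal InfoNCE-style loss can be sketched as follows. This is a simplified cross-view version (full SimCLR's NT-Xent also uses the other in-batch views as negatives), written in plain numpy for clarity.

```python
import numpy as np

def info_nce(z1, z2, temp=0.1):
    """Simplified InfoNCE: each embedding z1[i] should match z2[i]
    (its augmented view) among all candidates in z2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # project to unit sphere
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temp                             # cosine similarities / temperature
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                   # cross-entropy on matching pairs
```

Matched view pairs should yield a lower loss than mismatched ones, which is what drives the representations of augmentations of the same image together.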


Learning in Stackelberg Mean Field Games: A Non-Asymptotic Analysis

Zeng, Sihan, Evans, Benjamin Patrick, Bhatt, Sujay, Ardon, Leo, Ganesh, Sumitra, Koppel, Alec

arXiv.org Artificial Intelligence

We study policy optimization in Stackelberg mean field games (MFGs), a hierarchical framework for modeling the strategic interaction between a single leader and an infinitely large population of homogeneous followers. The objective can be formulated as a structured bi-level optimization problem, in which the leader needs to learn a policy maximizing its reward, anticipating the response of the followers. Existing methods for solving these (and related) problems often rely on restrictive independence assumptions between the leader's and followers' objectives, use samples inefficiently due to nested-loop algorithm structure, and lack finite-time convergence guarantees. To address these limitations, we propose AC-SMFG, a single-loop actor-critic algorithm that operates on continuously generated Markovian samples. The algorithm alternates between (semi-)gradient updates for the leader, a representative follower, and the mean field, and is simple to implement in practice. We establish the finite-time and finite-sample convergence of the algorithm to a stationary point of the Stackelberg objective. To our knowledge, this is the first Stackelberg MFG algorithm with non-asymptotic convergence guarantees. Our key assumption is a "gradient alignment" condition, which requires that the full policy gradient of the leader can be approximated by a partial component of it, relaxing the existing leader-follower independence assumption. Simulation results in a range of well-established economics environments demonstrate that AC-SMFG outperforms existing multi-agent and MFG learning baselines in policy quality and convergence speed.
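The single-loop alternation described above can be caricatured on a scalar toy problem: one semi-gradient step each for the leader, a representative follower, and the mean field per iteration, with the leader differentiating only the partial component of its objective (the "gradient alignment" idea). This is a stand-in for the structure of AC-SMFG, not the paper's algorithm; all quantities and step sizes here are invented for illustration.

```python
def ac_smfg_sketch(steps=2000, alpha=0.01, beta=0.05, gamma=0.05):
    """Toy single-loop alternation: leader policy x, representative follower
    policy y, mean field m. The follower tracks its best response to x, the
    mean field tracks the follower, and the leader takes a semi-gradient step
    on its reward -(x - 1)^2 holding (y, m) fixed."""
    x, y, m = 0.0, 0.0, 0.0
    for _ in range(steps):
        y += beta * (x - y)            # follower: gradient step on -(y - x)^2
        m += gamma * (y - m)           # mean field: running tracker of the population
        x += alpha * (-2.0 * (x - 1.0))  # leader: partial (semi-)gradient only
    return x, y, m
```

The point of the single-loop structure is that no inner problem is solved to completion: all three quantities drift together on Markovian samples, which is what makes the finite-time analysis nontrivial.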


Convergence and stability of Q-learning in Hierarchical Reinforcement Learning

Manenti, Massimiliano, Iannelli, Andrea

arXiv.org Artificial Intelligence

Decision-making architectures have played a central role for decades [1] both in engineering and other domains, e.g., guidance, navigation and control of Apollo missions [2], chemical plants [3], smart grids [4], unmanned aerial vehicles [5], recommender systems [6], and algorithms [7]. Moreover, architectures are ubiquitous in nature, e.g., diversity in the nervous system enables humans to have fast and accurate sensorimotor control [8]. Reinforcement Learning (RL) is a framework in which an agent learns to make sequential decisions through interaction with an environment in order to maximize cumulative reward [9]. Decision-making architectures have also been proposed and studied in RL. Hierarchical Reinforcement Learning (HRL) is a subfield of RL that deals with hierarchical structures for decision-making agents. Prospective advantages include improved long-term credit assignment, continual learning, interpretability, and the integration of preexisting policies [10], [11].
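Since the paper concerns convergence of Q-learning inside hierarchical agents, a minimal flat Q-learning baseline is useful context. The sketch below is textbook tabular Q-learning on a small deterministic chain (our own toy environment, not the paper's setting), using the standard update Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

```python
import numpy as np

def q_learning_chain(n_states=5, episodes=500, alpha=0.2, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a chain: action 0 moves left (floored at 0),
    action 1 moves right; reward 1 only on reaching the rightmost state,
    which is terminal. Epsilon-greedy exploration."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # terminal state contributes no bootstrap value
            target = r + gamma * Q[s2].max() * (s2 != n_states - 1)
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q
```

HRL methods layer temporally extended actions (options, subgoals) on top of exactly this kind of update, which is where the convergence and stability questions studied in the paper arise.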