Basar, Tamer
Decentralized Nonconvex Optimization with Guaranteed Privacy and Accuracy
Wang, Yongqiang, Basar, Tamer
Privacy protection and nonconvexity are two challenging problems in decentralized optimization and learning involving sensitive data. Despite some recent advances addressing each of the two problems separately, no results have been reported that provide theoretical guarantees on both privacy protection and saddle/maximum avoidance in decentralized nonconvex optimization. We propose a new algorithm for decentralized nonconvex optimization that enables both rigorous differential privacy and saddle/maximum-avoiding performance. The new algorithm allows the incorporation of persistent additive noise to enable rigorous differential privacy for data samples, gradients, and intermediate optimization variables without losing provable convergence, thus circumventing the dilemma of trading accuracy for privacy in differential-privacy design. More interestingly, the algorithm is theoretically proven to efficiently guarantee accuracy by avoiding convergence to local maxima and saddle points, which has not been reported before in the literature on decentralized nonconvex optimization. The algorithm is efficient in both communication (it shares only one variable in each iteration) and computation (it is encryption-free), and hence is promising for large-scale nonconvex optimization and learning involving high-dimensional optimization parameters. Numerical experiments on both a decentralized estimation problem and an Independent Component Analysis (ICA) problem confirm the effectiveness of the proposed approach.
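The core idea of injecting persistent additive noise into the shared variables of a decentralized gradient method can be sketched as follows. This is a minimal generic illustration, not the paper's algorithm: the mixing matrix, step size, decaying Laplace noise schedule, and toy objective are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def decentralized_dp_step(x, W, grads, step, noise_scale):
    """One iteration: each agent shares its variable plus Laplace noise,
    mixes neighbors' noisy copies via the weight matrix W, then takes a
    local gradient step. (Generic sketch, not the paper's update rule.)"""
    noisy = x + rng.laplace(scale=noise_scale, size=x.shape)  # DP perturbation
    mixed = W @ noisy                                         # consensus mixing
    return mixed - step * grads(x)                            # local gradient step

# Toy problem: 3 agents jointly minimizing sum_i (x - a_i)^2, optimum at mean(a) = 2.
a = np.array([1.0, 2.0, 3.0])
grads = lambda x: 2.0 * (x - a)        # per-agent gradients
W = np.full((3, 3), 1.0 / 3.0)         # doubly stochastic mixing matrix

x = np.zeros(3)
for k in range(200):
    # A decaying noise schedule is one simple way to keep iterates convergent
    # while still perturbing every shared message.
    x = decentralized_dp_step(x, W, grads, step=0.05, noise_scale=0.5 / (k + 1))

print(np.round(x.mean(), 2))  # the network average settles near the optimum 2.0
```

The sketch shows why naive constant-variance noise forces an accuracy/privacy trade-off: here accuracy is recovered only because the noise decays, whereas the paper's contribution is to retain provable convergence under persistent noise.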
The Confluence of Networks, Games and Learning
Li, Tao, Peng, Guanze, Zhu, Quanyan, Basar, Tamer
Recent years have witnessed significant advances in technologies and services in modern network applications, including smart grid management, wireless communication, cybersecurity, and multi-agent autonomous systems. Considering the heterogeneous nature of networked entities, emerging network applications call for game-theoretic models and learning-based approaches in order to create distributed network intelligence that responds to uncertainties and disruptions in a dynamic or adversarial environment. This paper articulates the confluence of networks, games, and learning, which establishes a theoretical underpinning for understanding multi-agent decision-making over networks. We provide a selective overview of game-theoretic learning algorithms within the framework of stochastic approximation theory, together with applications in some representative contexts of modern network systems, such as next-generation wireless communication networks, the smart grid, and distributed machine learning. In addition to existing research on game-theoretic learning over networks, we highlight several new angles and research endeavors on learning in games that are related to recent developments in artificial intelligence. Some of these new angles extrapolate from our own research interests. The overall objective of the paper is to give the reader a clear picture of the strengths and challenges of adopting game-theoretic learning methods within the context of network systems, and further to identify fruitful future research directions in both theoretical and applied studies.
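A classic instance of game-theoretic learning that fits the stochastic-approximation template discussed above is fictitious play, in which each player best-responds to the opponent's empirical action frequencies. The following sketch runs it on matching pennies; the game, initialization, and horizon are illustrative choices, not taken from the paper.

```python
import numpy as np

# Row player's payoff matrix for matching pennies (column player gets the negative):
# row wins on a match, column wins on a mismatch.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

counts_row = np.array([1.0, 0.0])   # arbitrary initial empirical counts
counts_col = np.array([0.0, 1.0])

for t in range(5000):
    # Each player best-responds to the opponent's empirical mixed strategy.
    br_row = np.argmax(A @ (counts_col / counts_col.sum()))
    br_col = np.argmax(-(counts_row / counts_row.sum()) @ A)
    counts_row[br_row] += 1
    counts_col[br_col] += 1

freq_row = counts_row / counts_row.sum()
print(freq_row)  # empirical play approaches the mixed equilibrium (1/2, 1/2)
```

Viewed through stochastic approximation, the empirical frequencies are a decreasing-step-size recursion whose limiting behavior tracks the best-response dynamics, which is why convergence guarantees for such learning rules hinge on the structure of the underlying game (here, zero-sum).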
A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning
Suttle, Wesley, Yang, Zhuoran, Zhang, Kaiqing, Wang, Zhaoran, Basar, Tamer, Liu, Ji
In this work we develop a new off-policy actor-critic algorithm that performs policy improvement with convergence guarantees in the multi-agent setting using function approximation. To achieve this, we extend the method of emphatic temporal differences (ETD(λ)) to the multi-agent setting with provable convergence under linear function approximation, and we also derive a novel off-policy policy gradient theorem for the multi-agent setting. Using these new results, we develop our two-timescale algorithm, which uses ETD(λ) to perform policy evaluation for the critic step at a faster timescale and policy gradient ascent using emphatic weightings for the actor step at a slower timescale. We also provide convergence guarantees for the actor step. Our work builds on recent advances in three main areas: multi-agent on-policy actor-critic methods, emphatic temporal difference learning for off-policy policy evaluation, and the use of emphatic weightings in off-policy policy gradient methods.
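The single-agent ETD(λ) recursion that the critic step builds on can be sketched as below: a follow-on trace accumulates discounted importance ratios, an emphasis weighting scales the eligibility trace, and a TD update adjusts the linear weights. The Markov chain, behavior/target policies, and step sizes are illustrative assumptions; the paper's multi-agent extension and two-timescale coupling are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

n_states, gamma, lam, alpha = 3, 0.8, 0.5, 0.02
phi = np.eye(n_states)                # tabular (one-hot) features
w = np.zeros(n_states)                # value-function weights
F, e, rho_prev = 0.0, np.zeros(n_states), 0.0  # follow-on trace, eligibility trace

s = 0
for t in range(2000):
    a = rng.integers(2)                       # behavior policy: uniform over 2 actions
    rho = (0.8 if a == 0 else 0.2) / 0.5      # importance ratio pi(a|s) / b(a|s)
    s_next = (s + a + 1) % n_states           # toy deterministic transitions
    r = 1.0 if s_next == 0 else 0.0

    F = rho_prev * gamma * F + 1.0            # follow-on trace (interest i_t = 1)
    M = lam + (1.0 - lam) * F                 # emphasis weighting
    delta = r + gamma * w @ phi[s_next] - w @ phi[s]   # TD error
    e = rho * (gamma * lam * e + M * phi[s])  # emphatic eligibility trace
    w = w + alpha * delta * e                 # linear weight update

    rho_prev, s = rho, s_next

print(np.round(w, 2))  # estimated state values under the target policy
```

The emphasis term M is what distinguishes ETD(λ) from ordinary off-policy TD(λ): it reweights updates by how much each state is "followed" under the target policy, which underlies the convergence guarantees the paper extends to the multi-agent critic.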
A Game Theoretical Error-Correction Framework for Secure Traffic-Sign Classification
Sayin, Muhammed O., Lin, Chung-Wei, Kang, Eunsuk, Shiraishi, Shinichi, Basar, Tamer
We introduce a game-theoretic error-correction framework to design classification algorithms that are reliable even in adversarial environments, with a specific focus on traffic-sign classification. Machine learning algorithms possess inherent vulnerabilities to maliciously crafted inputs, especially in high-dimensional input spaces. We seek to achieve reliable and timely classification performance by physically redesigning the input space to one of significantly lower dimension. Traffic-sign classification is an important use case enabling this redesign, since traffic signs have already been designed for easy recognition by human drivers. We encode the original input samples as, e.g., strings of bits, through error-correction methods that can guarantee a minimum distance between any two different encoded inputs, and we model the interaction between the defense and the adversary as a game. We then analyze the underlying game using the concept of hierarchical equilibrium, where the defense strategies are designed by taking into account the best possible attack against them. At large scale, for computational simplicity, we provide an approximate solution in which we transform the problem into an efficient linear program of substantially smaller size than the original input space. Finally, we examine the performance of the proposed scheme in several traffic-sign classification scenarios.
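The distance guarantee at the heart of the encoding idea can be illustrated with a toy codebook: assign each sign class a codeword with a guaranteed pairwise Hamming distance, then decode a (possibly adversarially corrupted) bit pattern to the nearest codeword. The classes, codewords, and distances below are invented for the example; the paper's actual encoding is not reproduced.

```python
import numpy as np

# Toy codebook: 3 hypothetical sign classes with pairwise Hamming distance >= 4,
# so nearest-codeword decoding corrects up to floor((4 - 1) / 2) = 1 flipped bit.
codebook = {
    "stop":  np.array([0, 0, 0, 0, 0, 0]),
    "yield": np.array([1, 1, 1, 1, 0, 0]),
    "limit": np.array([0, 0, 1, 1, 1, 1]),
}

def decode(bits):
    """Classify by nearest codeword in Hamming distance."""
    return min(codebook, key=lambda c: int(np.sum(codebook[c] != bits)))

# An adversary flips one bit of the "yield" codeword; decoding still recovers it.
corrupted = codebook["yield"].copy()
corrupted[0] ^= 1
print(decode(corrupted))  # -> yield
```

This is what makes the game analysis tractable: the adversary's effective action space collapses from perturbations of a high-dimensional image to bit-flip budgets against a small codebook, which is also why the approximate linear program can be so much smaller than the original input space.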