Max Planck Institute for Intelligent Systems
Variational Bayes In Private Settings (VIPS)
Park, Mijung (Max Planck Institute for Intelligent Systems) | Foulds, James | Chaudhuri, Kamalika | Welling, Max
Many applications of Bayesian data analysis involve sensitive information such as personal documents or medical records, motivating methods which ensure that privacy is protected. We introduce a general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method. Our framework respects differential privacy, the gold-standard privacy criterion, and encompasses a large class of probabilistic models, called the Conjugate Exponential (CE) family. We observe that we can straightforwardly privatise VB's approximate posterior distributions for models in the CE family, by perturbing the expected sufficient statistics of the complete-data likelihood. For a widely used class of non-CE models, those with binomial likelihoods, we show how to bring such models into the CE family, such that inferences in the modified model resemble the private variational Bayes algorithm as closely as possible, using the Pólya-Gamma data augmentation scheme. The iterative nature of variational Bayes presents a further challenge since iterations increase the amount of noise needed. We overcome this by combining: (1) an improved composition method for differential privacy, called the moments accountant, which provides a tight bound on the privacy cost of multiple VB iterations and thus significantly decreases the amount of additive noise; and (2) the privacy amplification effect of subsampling mini-batches from large-scale data in stochastic learning. We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets.
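As a rough illustration of the core mechanism described in this abstract, the sketch below clips per-example expected sufficient statistics and perturbs their sum with Gaussian noise before a conjugate-exponential update. The function names (private_expected_suff_stats, vb_update), the clipping threshold, and the fixed noise_sigma are illustrative assumptions; calibrating the noise to a target (epsilon, delta) budget with the moments accountant is not shown.

```python
import numpy as np

def private_expected_suff_stats(suff_stats, clip_norm=1.0, noise_sigma=1.0, rng=None):
    """Clip each per-example sufficient statistic in L2 norm and add Gaussian
    noise to the sum, so the conjugate update only ever sees a perturbed aggregate."""
    rng = np.random.default_rng() if rng is None else rng
    norms = np.linalg.norm(suff_stats, axis=1, keepdims=True)
    clipped = suff_stats * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = rng.normal(scale=noise_sigma * clip_norm, size=suff_stats.shape[1])
    return clipped.sum(axis=0) + noise

def vb_update(prior_natural_params, minibatch_stats, scale):
    """One conjugate-exponential style update from privatised statistics;
    `scale` re-weights the mini-batch to the full data set size, as in
    stochastic variational inference."""
    return prior_natural_params + scale * private_expected_suff_stats(minibatch_stats)

# Toy usage: a mini-batch of 32 examples with 5-dimensional sufficient
# statistics, subsampled from a data set of 1000 records.
stats = np.random.default_rng(1).normal(size=(32, 5))
posterior_params = vb_update(np.zeros(5), stats, scale=1000 / 32)
```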
Reports on the 2017 AAAI Spring Symposium Series
Bohg, Jeannette (Max Planck Institute for Intelligent Systems) | Boix, Xavier (Massachusetts Institute of Technology) | Chang, Nancy (Google) | Churchill, Elizabeth F. (Google) | Chu, Vivian (Georgia Institute of Technology) | Fang, Fei (Harvard University) | Feldman, Jerome (University of California at Berkeley) | González, Avelino J. (University of Central Florida) | Kido, Takashi (Preferred Networks in Japan) | Lawless, William F. (Paine College) | Montaña, José L. (University of Cantabria) | Ontañón, Santiago (Drexel University) | Sinapov, Jivko (University of Texas at Austin) | Sofge, Don (Naval Research Laboratory) | Steels, Luc (Institut de Biologia Evolutiva) | Steenson, Molly Wright (Carnegie Mellon University) | Takadama, Keiki (University of Electro-Communications) | Yadav, Amulya (University of Southern California)
A rise in real-world applications of AI has stimulated significant interest from the public, media, and policy makers. Along with this increasing attention has come a media-fueled concern about purported negative consequences of AI, which often overlooks the societal benefits that AI is delivering and can deliver in the near future. To address these concerns, the symposium on Artificial Intelligence for the Social Good (AISOC-17) highlighted the benefits that AI can bring to society right now. It brought together AI researchers, practitioners, experts, and policy makers from a wide variety of domains. It is also important to remember that having a very sharp distinction of AI for social good research is not always feasible, and often unnecessary. While there has been significant progress, there still exist many major challenges facing the design of effective AI-based approaches to deal with the difficulties in real-world domains. One of the challenges is interpretability, since most algorithms for AI for social good problems need to be used by human end users. Second, the lack of access to valuable data that could be crucial to the development of appropriate algorithms is yet another challenge. Third, the data that we get from the real world is often noisy and …
Policy Search with High-Dimensional Context Variables
Tangkaratt, Voot (The University of Tokyo) | Hoof, Herke van (McGill University) | Parisi, Simone (Technical University of Darmstadt) | Neumann, Gerhard (University of Lincoln) | Peters, Jan (Max Planck Institute for Intelligent Systems) | Sugiyama, Masashi (The University of Tokyo)
Direct contextual policy search methods learn to improve policy parameters and simultaneously generalize these parameters to different context or task variables. However, learning from high-dimensional context variables, such as camera images, is still a prominent problem in many real-world tasks. A naive application of unsupervised dimensionality reduction methods to the context variables, such as principal component analysis, is insufficient as task-relevant input may be ignored. In this paper, we propose a contextual policy search method in the model-based relative entropy stochastic search framework with integrated dimensionality reduction. We learn a model of the reward that is locally quadratic in both the policy parameters and the context variables. Furthermore, we perform supervised linear dimensionality reduction on the context variables by nuclear norm regularization. The experimental results show that the proposed method outperforms naive dimensionality reduction via principal component analysis and a state-of-the-art contextual policy search method.
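To make the dimensionality-reduction ingredient of this abstract concrete, the sketch below fits a reward model that is bilinear in the policy parameters and the context, with a nuclear-norm penalty solved by proximal gradient descent via singular value thresholding. It is a simplified illustration under assumed names (svt, fit_low_rank_reward); the locally quadratic terms and the relative-entropy policy update of the paper are omitted.

```python
import numpy as np

def svt(W, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def fit_low_rank_reward(thetas, contexts, rewards, lam=0.1, lr=0.1, iters=300):
    """Fit r ~ theta^T W s by proximal gradient descent; the nuclear-norm
    penalty drives W toward low rank, i.e. a few task-relevant context directions."""
    W = np.zeros((thetas.shape[1], contexts.shape[1]))
    for _ in range(iters):
        residual = np.einsum('ni,ij,nj->n', thetas, W, contexts) - rewards
        grad = np.einsum('n,ni,nj->ij', residual, thetas, contexts) / len(rewards)
        W = svt(W - lr * grad, lr * lam)   # gradient step, then proximal step
    return W

# Toy data: 300 samples, 5-dim policy parameters, 50-dim contexts with a
# rank-1 ground-truth reward model.
rng = np.random.default_rng(0)
thetas, contexts = rng.normal(size=(300, 5)), rng.normal(size=(300, 50))
true_W = np.outer(rng.normal(size=5), rng.normal(size=50))
rewards = np.einsum('ni,ij,nj->n', thetas, true_W, contexts)
W_hat = fit_low_rank_reward(thetas, contexts, rewards)
```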
Learning to Select and Generalize Striking Movements in Robot Table Tennis
Muelling, Katharina (Max Planck Institute for Intelligent Systems) | Kober, Jens (Max Planck Institute for Intelligent Systems) | Kroemer, Oliver (Technische Universitaet Darmstadt) | Peters, Jan (Technische Universitaet Darmstadt)
Learning new motor tasks autonomously from interaction with a human being is an important goal for both robotics and machine learning. However, when moving beyond basic skills, most monolithic machine learning approaches fail to scale. In this paper, we take the task of learning table tennis as an example and present a new framework which allows a robot to learn cooperative table tennis from interaction with a human. To this end, the robot first learns a set of elementary table tennis hitting movements from a human teacher by kinesthetic teach-in, which is compiled into a set of dynamical system motor primitives (DMPs). Subsequently, the system generalizes these movements to a wider range of situations using our mixture of motor primitives (MoMP) approach. The resulting policy enables the robot to select appropriate motor primitives as well as to generalize between them. Finally, the robot plays with a human table tennis partner and learns online to improve its behavior.
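The sketch below illustrates the gating idea behind a mixture of motor primitives: each primitive is associated with the context it was demonstrated in, a gating function weights primitives by how well those contexts match the current situation, and the executed command blends the primitive outputs accordingly. The Gaussian gating, the class name MixtureOfMotorPrimitives, and the dummy 7-DoF primitive outputs are assumptions for illustration, not the authors' learned DMP parameterisation.

```python
import numpy as np

class MixtureOfMotorPrimitives:
    def __init__(self, primitive_contexts, primitive_policies, bandwidth=1.0):
        self.contexts = np.asarray(primitive_contexts)   # (K, d) demonstration contexts
        self.policies = primitive_policies                # K callables: context -> command
        self.bandwidth = bandwidth

    def gating(self, context):
        """Soft responsibility of each primitive for the current context."""
        sq_dist = np.sum((self.contexts - context) ** 2, axis=1)
        w = np.exp(-0.5 * sq_dist / self.bandwidth ** 2)
        return w / w.sum()

    def act(self, context):
        """Blend primitive outputs according to the gating weights."""
        weights = self.gating(context)
        commands = np.stack([pi(context) for pi in self.policies])
        return weights @ commands

# Toy usage: three primitives indexed by a 2-D ball-state context, each
# producing a dummy 7-DoF joint command.
rng = np.random.default_rng(0)
prim_contexts = rng.normal(size=(3, 2))
prim_policies = [lambda c, a=a: a * np.ones(7) for a in (0.1, 0.5, 0.9)]
momp = MixtureOfMotorPrimitives(prim_contexts, prim_policies)
print(momp.act(np.array([0.2, -0.3])))
```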
Balancing Safety and Exploitability in Opponent Modeling
Wang, Zhikun (Max Planck Institute for Intelligent Systems) | Boularias, Abdeslam (Max Planck Institute for Intelligent Systems) | Mülling, Katharina (Max Planck Institute for Intelligent Systems) | Peters, Jan (Max Planck Institute for Intelligent Systems)
Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a new modeling technique that adaptively balances exploitability and risk reduction. An opponent’s strategy is modeled with a set of possible strategies that contain the actual strategy with a high probability. The algorithm is safe as the expected payoff is above the minimax payoff with a high probability, and can exploit the opponents’ preferences when sufficient observations have been obtained. We apply the technique to normal-form games and stochastic games with a finite number of stages. The performance of the proposed approach is first demonstrated on repeated rock-paper-scissors games. Subsequently, the approach is evaluated in a human-robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand or middle preparation pose before they serve. The learned strategies can exploit the opponent’s preferences, leading to a higher rate of successful returns.
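A minimal sketch of the safety/exploitability trade-off on repeated rock-paper-scissors: keep an L1 confidence set around the empirical opponent strategy and play the mixed strategy with the best worst-case payoff over that set. With few observations the set is large and the choice stays near the minimax strategy; with many observations it shrinks and the opponent's preferences get exploited. The concentration radius, the simplex grid, and the function names (simplex_grid, robust_response) are illustrative simplifications rather than the authors' exact algorithm.

```python
import numpy as np

# Row player's payoff for rock-paper-scissors: rows = our action (R, P, S),
# columns = opponent action (R, P, S).
PAYOFF = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]])

def simplex_grid(steps=20):
    """All mixed strategies over three actions on a regular grid."""
    pts = [np.array([i, j, steps - i - j], dtype=float) / steps
           for i in range(steps + 1) for j in range(steps + 1 - i)]
    return np.array(pts)

def robust_response(counts, delta=0.05):
    """Mixed strategy maximising the worst-case payoff over an L1 confidence
    set of opponent strategies built from observed action counts."""
    n = counts.sum()
    p_hat = counts / n
    radius = np.sqrt(2 * np.log((2 ** 3 - 2) / delta) / n)   # L1 concentration bound
    grid = simplex_grid()
    opponents = grid[np.abs(grid - p_hat).sum(axis=1) <= radius]
    worst_case = (grid @ PAYOFF @ opponents.T).min(axis=1)   # per candidate strategy
    return grid[np.argmax(worst_case)]

# Toy usage: after 60 observations of a rock-heavy opponent, the response
# shifts toward paper while its worst case stays at or above the minimax value.
print(robust_response(np.array([40.0, 10.0, 10.0])))
```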
Modeling Opponent Actions for Table-Tennis Playing Robot
Wang, Zhikun (Max Planck Institute for Intelligent Systems) | Boularias, Abdeslam (Max Planck Institute for Intelligent Systems) | Mülling, Katharina (Max Planck Institute for Intelligent Systems) | Peters, Jan (Max Planck Institute for Intelligent Systems)
Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a modeling technique that adaptively balances safety and exploitability. The opponent's strategy is modeled with a set of possible strategies that contains the actual one with high probability. The algorithm is safe as the expected payoff is above the minimax payoff with high probability, and can exploit the opponent's preferences when sufficient observations are obtained. We apply the algorithm to a robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand or middle preparation pose before they serve. The learned strategies can exploit the opponent's preferences, leading to a higher rate of successful returns.