Reinforcement Learning with Parameterized Actions

AAAI Conferences

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions: discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We present the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains.
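To make the alternating scheme concrete, here is a minimal, runnable sketch of the Q-PAMDP idea on a one-state ("bandit") problem with two parameterized actions. The reward functions, sample sizes, and step sizes are invented for illustration and are not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(action, x):
    # Hypothetical rewards: each discrete action has its own optimum
    # in its continuous parameter x.
    if action == 0:
        return 1.0 - (x - 0.3) ** 2 + 0.05 * rng.standard_normal()
    return 1.5 - 2.0 * (x - 0.8) ** 2 + 0.05 * rng.standard_normal()

params = np.zeros(2)          # one continuous parameter per discrete action
for _ in range(100):
    # Step 1: parameters fixed -- estimate Q(a) by Monte Carlo rollouts.
    Q = np.array([np.mean([reward(a, params[a]) for _ in range(50)])
                  for a in (0, 1)])
    # Step 2: Q fixed -- improve each action's parameter with a
    # finite-difference gradient step on expected reward.
    eps, lr = 0.05, 0.1
    for a in (0, 1):
        up = np.mean([reward(a, params[a] + eps) for _ in range(50)])
        down = np.mean([reward(a, params[a] - eps) for _ in range(50)])
        params[a] += lr * (up - down) / (2 * eps)

print("learned parameters:", params)        # approaches [0.3, 0.8]
print("greedy action:", int(np.argmax(Q)))  # action 1: higher peak reward
```

Each alternation first re-estimates the discrete-action values under the current parameter policy, then locally improves each action's continuous parameter, mirroring the alternation the abstract describes.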


Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks

AAAI Conferences

Effective training of deep neural networks suffers from two main issues. The first is that the parameter space of these models exhibits pathological curvature. Recent methods address this problem by using adaptive preconditioning for Stochastic Gradient Descent (SGD); these methods improve convergence by adapting to the local geometry of the parameter space. The second issue is overfitting, which is typically addressed by early stopping; however, recent work has demonstrated that Bayesian model averaging mitigates this problem. The posterior can be sampled using Stochastic Gradient Langevin Dynamics (SGLD), but the rapidly changing curvature renders default SGLD methods inefficient. Here, we propose combining adaptive preconditioners with SGLD. In support of this idea, we establish theoretical properties on asymptotic convergence and predictive risk. We also provide empirical results for Logistic Regression, Feedforward Neural Nets, and Convolutional Neural Nets, demonstrating that our preconditioned SGLD method gives state-of-the-art performance on these models.
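As a rough illustration of the recipe (my own toy, not the authors' code), the sketch below runs an RMSprop-style preconditioned Langevin update on a synthetic Bayesian logistic-regression posterior. The full preconditioned update includes an additional curvature-correction term that practical implementations often drop; this sketch drops it too, and all constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 500, 3
X = rng.standard_normal((N, D))
w_true = np.array([1.5, -2.0, 0.5])
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def grad_log_post(w, idx):
    # Minibatch gradient of the log posterior: log-likelihood rescaled
    # to the full data set, plus a standard-normal prior on w.
    p = 1.0 / (1.0 + np.exp(-X[idx] @ w))
    return (N / len(idx)) * (X[idx].T @ (y[idx] - p)) - w

w = np.zeros(D)
V = (grad_log_post(w, np.arange(N)) / N) ** 2   # warm-start 2nd moments
eps, alpha, lam = 1e-4, 0.99, 1e-5
samples = []
for t in range(5000):
    idx = rng.choice(N, size=50, replace=False)
    g = grad_log_post(w, idx)
    V = alpha * V + (1 - alpha) * (g / N) ** 2   # RMSprop-style estimate
    G = 1.0 / (lam + np.sqrt(V))                 # diagonal preconditioner
    # Preconditioned Langevin step: scaled gradient plus Gaussian noise
    # with (diagonal) covariance eps * G.
    w += 0.5 * eps * G * g + np.sqrt(eps * G) * rng.standard_normal(D)
    if t >= 1000:
        samples.append(w.copy())

print("posterior mean estimate:", np.mean(samples, axis=0))  # ~ w_true
```

The preconditioner shrinks steps along sharply curved directions and stretches them along flat ones, which is exactly the pathological-curvature problem the abstract targets.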


Rules for Choosing Societal Tradeoffs

AAAI Conferences

We study the societal tradeoffs problem, where a set of voters each submit their ideal tradeoff value between each pair of activities (e.g., "using a gallon of gasoline is as bad as creating 2 bags of landfill trash"), and these are then aggregated into the societal tradeoff vector using a rule. We introduce the family of distance-based rules and show that these can be justified as maximum likelihood estimators of the truth. Within this family, we single out the logarithmic distance-based rule as especially appealing based on a social-choice-theoretic axiomatization. We give an efficient algorithm for executing this rule, as well as an approximate hill-climbing algorithm, and evaluate both experimentally.
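For intuition (a toy example of my own, not the paper's code): with an L1 distance measured in log space and only a single pair of activities, the logarithmic distance-based rule reduces to exponentiating the median of the log tradeoffs, which makes it robust to extreme votes:

```python
import numpy as np

# Votes for "1 gallon of gasoline is as bad as x bags of landfill trash".
votes = np.array([1.5, 2.0, 2.0, 3.0, 10.0])

# Logarithmic distance-based aggregate for a single activity pair:
# minimize sum_i |log t - log t_i|  =>  median in log space.
societal = np.exp(np.median(np.log(votes)))
print(societal)   # 2.0 -- barely moved by the outlier vote of 10
```

With three or more activities the aggregate must also respect multiplicative consistency across pairs (the a-to-c tradeoff equals the a-to-b tradeoff times the b-to-c tradeoff), which is what makes executing the rule in general algorithmically nontrivial.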


Maximizing Revenue with Limited Correlation: The Cost of Ex-Post Incentive Compatibility

AAAI Conferences

In a landmark paper in the mechanism design literature, Cremer and McLean (1985) (CM for short) show that when a bidder's valuation is correlated with an external signal, a monopolistic seller is able to extract the full social surplus as revenue. In the original paper and subsequent literature, the focus has been on ex-post incentive compatible (or IC) mechanisms, where truth telling is an ex-post Nash equilibrium. In this paper, we explore the implications of Bayesian versus ex-post IC in a correlated valuation setting. We generalize the full extraction result to settings that do not satisfy the assumptions of CM. In particular, we give necessary and sufficient conditions for full extraction that strictly relax the original conditions given in CM. These more general conditions characterize the situations under which requiring ex-post IC leads to a decrease in expected revenue relative to Bayesian IC. We also demonstrate that the expected revenue from the optimal ex-post IC mechanism guarantees at most a (|Θ| + 1)/4 approximation to that of a Bayesian IC mechanism, where |Θ| is the number of bidder types. Finally, using techniques from automated mechanism design, we show that, for randomly generated distributions, the average expected revenue achieved by Bayesian IC mechanisms is significantly larger than that for ex-post IC mechanisms.
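As a numerical illustration of the classic full-extraction construction (a toy of my own with made-up numbers, not the paper's analysis): when the matrix of type-conditional signal distributions is invertible, the seller can solve a linear system for signal-contingent side bets whose expected cost to each type equals that type's surplus:

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0])      # bidder valuations by type
P = np.array([[0.6, 0.3, 0.1],             # P[theta, omega]: distribution
              [0.2, 0.6, 0.2],             # of the external signal omega
              [0.1, 0.3, 0.6]])            # conditional on the type theta

# Find signal-contingent payments z (one per signal realization) with
# E[z | theta] = surplus(theta) for every type; full rank of P makes
# this solvable in the square case sketched here.
z = np.linalg.solve(P, values)
print("side-bet payments:", z)
print("expected payment by type:", P @ z)   # equals `values`: full extraction
```

Because each type's expected payment under the side bets exactly matches its surplus, the seller's expected revenue equals the full social surplus, which is the benchmark the abstract's Bayesian-versus-ex-post comparison is measured against.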


High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models

AAAI Conferences

Learning in deep models using Bayesian methods has generated significant attention recently. This is largely because of the ability of modern Bayesian methods to yield scalable learning and inference while maintaining a measure of uncertainty in the model parameters. Stochastic gradient MCMC (SG-MCMC) algorithms are a family of diffusion-based sampling methods for large-scale Bayesian learning. In SG-MCMC, multivariate stochastic gradient thermostats (mSGNHT) augment each parameter of interest with a momentum and a thermostat variable to maintain stationary distributions as target posterior distributions. As the number of variables in a continuous-time diffusion increases, its numerical approximation error becomes a practical bottleneck, so a better numerical integrator is desirable. To this end, we propose using an efficient symmetric splitting integrator in mSGNHT instead of the traditional Euler integrator. We demonstrate that the proposed scheme is more accurate and robust, and converges faster. These properties are particularly desirable in Bayesian deep learning. Extensive experiments on two canonical models and their deep extensions demonstrate that the proposed scheme improves general Bayesian posterior sampling, particularly for deep models.
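For a feel of what a symmetric splitting step looks like, here is a small, runnable sketch (my own arrangement, not the paper's code) of an ABOBA-style split of a stochastic gradient Nose-Hoover thermostat on a 1-D standard-normal target, with artificial gradient noise; step size and noise scale are toy choices:

```python
import numpy as np

rng = np.random.default_rng(2)
h, D = 0.05, 1.0                 # step size and injected-noise scale
theta, p, xi = 0.0, 0.0, D       # parameter, momentum, thermostat

def noisy_grad(theta):
    # Gradient of log N(0, 1) plus artificial "minibatch" noise.
    return -theta + 0.5 * rng.standard_normal()

samples = []
for t in range(100_000):
    # A (half step): exact flow of {theta' = p, xi' = p^2 - 1}
    theta += 0.5 * h * p
    xi += 0.5 * h * (p * p - 1.0)
    # B (half step): exact flow of {p' = -xi * p}
    p *= np.exp(-0.5 * h * xi)
    # O (full step): stochastic-gradient kick plus injected noise
    p += h * noisy_grad(theta) + np.sqrt(2.0 * D * h) * rng.standard_normal()
    # Mirror the first half: B (half step), then A (half step)
    p *= np.exp(-0.5 * h * xi)
    theta += 0.5 * h * p
    xi += 0.5 * h * (p * p - 1.0)
    if t >= 10_000:
        samples.append(theta)

print(np.mean(samples), np.var(samples))   # ~ 0 and ~ 1 for N(0, 1)
```

Each sub-system is integrated exactly and the composition is symmetric in time, which is the source of the higher-order accuracy the abstract contrasts with a plain Euler step; the thermostat variable xi adapts to absorb the extra gradient noise.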


Solving DEC-POMDPs by Expectation Maximization of Value Function

AAAI Conferences

We present a new algorithm, called PIEM, to approximately solve for the policy of an infinite-horizon decentralized partially observable Markov decision process (DEC-POMDP). The algorithm uses expectation maximization (EM) only in the policy improvement step, with policy evaluation performed by solving the Bellman equation in terms of finite state controllers (FSCs). This marks a key distinction between PIEM and the earlier EM algorithm of Kumar and Zilberstein (2010): PIEM operates directly on the DEC-POMDP without transforming it into a mixture of dynamic Bayes nets. PIEM thus maximizes the value function precisely, avoiding complicated forward/backward message passing and the corresponding computational and memory cost. To overcome local optima, we follow Pajarinen and Peltonen (2011) and solve the DEC-POMDP for a finite-length horizon, using the resulting policy graph to initialize the FSCs. We solve the finite-horizon problem with a modified point-based policy generation (PBPG) algorithm, in which we provide a closed-form solution to a subproblem that the original PBPG solved by linear programming. Experimental results on benchmark problems show that the proposed algorithms compare favorably to state-of-the-art methods.
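To illustrate the policy-evaluation step (a single-agent toy of my own construction; the Dec-POMDP case couples one controller per agent), the value of a finite state controller satisfies a linear Bellman equation over (controller-node, world-state) pairs, so it can be computed exactly with one linear solve:

```python
import numpy as np

rng = np.random.default_rng(3)
S, A, O, N, gamma = 4, 2, 3, 3, 0.95   # states, actions, obs, FSC nodes

T = rng.dirichlet(np.ones(S), size=(S, A))      # T[s, a, s']
Z = rng.dirichlet(np.ones(O), size=(S, A))      # Z[s', a, o]
R = rng.standard_normal((S, A))                 # R[s, a]
act = rng.dirichlet(np.ones(A), size=N)         # FSC: P(a | node)
trans = rng.dirichlet(np.ones(N), size=(N, O))  # FSC: P(node' | node, o)

# Build V = r + gamma * M V over (node, state) pairs and solve it.
M = np.zeros((N * S, N * S))
r = np.zeros(N * S)
for n in range(N):
    for s in range(S):
        for a in range(A):
            r[n * S + s] += act[n, a] * R[s, a]
            for s2 in range(S):
                for o in range(O):
                    for n2 in range(N):
                        M[n * S + s, n2 * S + s2] += (
                            act[n, a] * T[s, a, s2]
                            * Z[s2, a, o] * trans[n, o, n2])
V = np.linalg.solve(np.eye(N * S) - gamma * M, r)
print(V.reshape(N, S))   # exact value of the controller, per node and state
```

An exact evaluation of this form is what lets an approach confine EM to the improvement step rather than using it for evaluation as well.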


Probabilistic Planning for Decentralized Multi-Robot Systems

AAAI Conferences

Multi-robot systems are an exciting application domain for AI research, and for Dec-POMDPs specifically. MacDec-POMDP methods can produce high-quality general solutions for realistic heterogeneous multi-robot coordination problems by automatically generating control and communication policies, given a model. In contrast to most existing multi-robot methods, which are specialized to a particular problem class, our approach can synthesize policies that exploit any opportunities for coordination present in the problem, while balancing uncertainty, sensor information, and information about other agents.


Reports on the 2014 AAAI Fall Symposium Series

AI Magazine

The AAAI 2014 Fall Symposium Series was held Thursday through Saturday, November 13–15, at the Westin Arlington Gateway in Arlington, Virginia, adjacent to Washington, DC. The titles of the seven symposia were Artificial Intelligence for Human-Robot Interaction; Energy Market Prediction; Expanding the Boundaries of Health Informatics Using AI; Knowledge, Skill, and Behavior Transfer in Autonomous Robots; Modeling Changing Perspectives: Reconceptualizing Sensorimotor Experiences; Natural Language Access to Big Data; and The Nature of Humans and Machines: A Multidisciplinary Discourse. The highlights of each symposium are presented in this report.


Reports on the 2014 AAAI Fall Symposium Series

AI Magazine

The program also included six keynote presentations, a funding panel, a community panel, and multiple breakout sessions. The keynote presentations, given by speakers who have been working on AI for HRI for many years, focused on the larger intellectual picture of this subfield. Each speaker was asked to address, from his or her personal perspective, why HRI is an AI problem and how AI research can bring us closer to the reality of humans interacting with robots on everyday tasks. Speakers included Rodney Brooks (Rethink Robotics), Manuela Veloso (Carnegie Mellon University), Michael Goodrich (Brigham Young University), Benjamin Kuipers (University of Michigan), Maja Mataric (University of Southern California), and Brian Scassellati (Yale University).


Cross-Modal Similarity Learning via Pairs, Preferences, and Active Supervision

AAAI Conferences

We present a probabilistic framework for learning pairwise similarities between objects belonging to different modalities, such as drugs and proteins, or text and images. Our framework is based on learning a binary-code representation for objects in each modality, and it has the following key properties: (i) it can leverage both pairwise constraints and easy-to-obtain relative-preference-based cross-modal constraints, (ii) the probabilistic framework naturally allows querying for the most useful/informative constraints, facilitating an active learning setting (existing methods for cross-modal similarity learning lack such a mechanism), and (iii) the binary code length is learned from the data. We demonstrate the effectiveness of the proposed approach on two problems that require computing pairwise similarities between cross-modal object pairs: cross-modal link prediction in bipartite graphs, and hashing-based cross-modal similarity search.
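Once such codes are learned, the payoff is that cross-modal similarity becomes a cheap code comparison. The following toy sketch (random codes standing in for learned ones, not the authors' model) scores every text-image pair by Hamming similarity in a shared code space and retrieves the nearest images for a text query:

```python
import numpy as np

rng = np.random.default_rng(4)
B = 16                                        # learned code length
codes_text = rng.integers(0, 2, size=(5, B))  # codes for 5 text objects
codes_img = rng.integers(0, 2, size=(8, B))   # codes for 8 image objects

# Hamming distance between every cross-modal pair via XOR bit counts,
# converted to a similarity in [0, 1].
ham = (codes_text[:, None, :] ^ codes_img[None, :, :]).sum(-1)
sim = 1.0 - ham / B
print(sim.round(2))                 # 5 x 8 cross-modal similarity matrix

# Retrieval: images ranked by similarity to text object 0.
print(np.argsort(-sim[0]))
```

The XOR-and-popcount comparison is what makes hashing-based cross-modal search fast at scale; all of the modeling effort in the framework goes into learning codes for which these Hamming similarities reflect the supervised pairwise and preference constraints.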