AITopics

We provide a novel, flexible, iterative refinement algorithm to automatically construct an approximate state space representation for Markov Decision Processes (MDPs). Our approach leverages bisimulation metrics, which have been used in prior work to generate features to represent the state space of MDPs. We address a drawback of this approach, which is the expensive computation of the bisimulation metrics. We propose an algorithm to generate an iteratively improving sequence of state space partitions. Partial metric computations guide the representation search and provide much lower space and computational complexity, while maintaining strong convergence properties. We provide theoretical results guaranteeing convergence as well as experimental illustrations of the accuracy and savings (in time and memory usage) of the new algorithm, compared to traditional bisimulation metric computation.
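As a rough illustration of what a sequence of improving state space partitions can look like, the sketch below performs classical partition refinement for exact bisimulation on a tiny MDP. It is not the algorithm of the paper, which relies on partial bisimulation metric computations; it only shows the simpler exact case. The function name, the dictionary layout of `reward` and `trans`, and the four-state example are assumptions made for this sketch.

```python
from collections import defaultdict


def refine_partition(states, actions, reward, trans, max_iters=100):
    """Coarsest exact-bisimulation partition of a small finite MDP.

    reward[(s, a)] is a number; trans[(s, a)] maps next states to probabilities.
    All names here are hypothetical, chosen for this sketch only.
    """
    block_of = {s: 0 for s in states}  # start with every state in one block
    for _ in range(max_iters):
        ids = {}
        new_block_of = {}
        for s in states:
            per_action = []
            for a in actions:
                # Probability mass that (s, a) sends into each current block.
                mass = defaultdict(float)
                for s2, p in trans[(s, a)].items():
                    mass[block_of[s2]] += p
                rounded = tuple(sorted((b, round(m, 9)) for b, m in mass.items()))
                per_action.append((reward[(s, a)], rounded))
            # Including the old block id guarantees blocks are only ever split.
            signature = (block_of[s], tuple(per_action))
            new_block_of[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block_of.values())):
            return new_block_of  # no block was split: fixed point reached
        block_of = new_block_of
    return block_of


# Hypothetical example: states 0 and 1 behave identically, 2 pays a reward,
# and 3 is an absorbing state.
states = [0, 1, 2, 3]
actions = ["a"]
reward = {(0, "a"): 0.0, (1, "a"): 0.0, (2, "a"): 1.0, (3, "a"): 0.0}
trans = {
    (0, "a"): {2: 1.0},
    (1, "a"): {2: 1.0},
    (2, "a"): {3: 1.0},
    (3, "a"): {3: 1.0},
}
blocks = refine_partition(states, actions, reward, trans)
```

Each pass can only split existing blocks, so the partitions form a refining sequence that stops changing once states in the same block agree on rewards and on the probability mass sent to every block.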

artificial intelligence, machine learning, reinforcement learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Massachusetts (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Knowledge-Based Probabilistic Logic Learning

Odom, Phillip (Indiana University) | Khot, Tushar (University of Wisconsin) | Porter, Reid (Los Alamos National Laboratory) | Natarajan, Sriraam (Indiana University)

Advice giving has long been explored in artificial intelligence to build robust learning algorithms. We consider advice giving in relational domains where the noise is systematic. The advice is provided as logical statements that are then explicitly considered by the learning algorithm at every update. Our empirical evidence shows that human advice can effectively accelerate learning in noisy structured domains, where humans have so far been used merely as labelers or as designers of the initial structure of the model.

artificial intelligence, expert system, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Industry:

Media > Film (1.00)
Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.51)

Tighter Value Function Bounds for Bayesian Reinforcement Learning

Lee, Kanghoon (KAIST) | Kim, Kee-Eung (KAIST)

Bayesian reinforcement learning (BRL) provides a principled framework for an optimal exploration-exploitation tradeoff in reinforcement learning. We focus on model-based BRL, which involves a compact formulation of the optimal tradeoff from the Bayesian perspective. However, computing the Bayes-optimal policy remains a computational challenge. In this paper, we propose a novel approach to compute tighter bounds on the Bayes-optimal value function, which is crucial for improving the performance of many model-based BRL algorithms. We then present how our bounds can be integrated into real-time AO* heuristic search, and provide a theoretical analysis of the impact of improved bounds on the search efficiency. We also provide empirical results on standard BRL domains that demonstrate the effectiveness of our approach.

artificial intelligence, machine learning, reinforcement learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Better Be Lucky than Good: Exceeding Expectations in MDP Evaluation

Keller, Thomas (University of Freiburg) | Geißer, Florian (University of Freiburg)

Markov Decision Processes (MDPs) offer a general framework to describe probabilistic planning problems of varying complexity. The development of algorithms that act successfully in MDPs is important to many AI applications. Since it is often impossible or intractable to evaluate MDP algorithms based on a theoretical analysis alone, the International Probabilistic Planning Competition (IPPC) was introduced to allow a comparison based on experimental evaluation. The idea is to approximate the quality of an MDP solver by performing a sequence of runs on a problem instance, and by using the average of the obtained results as an approximation of the expected reward.

Two other algorithms require the knowledge of the optimal policy and its expected reward. We show that the expected reward of the optimal policy is a lower bound for the expected performance of both strategies. Our final algorithm switches between the application of the optimal policy and the policy with the highest possible outcome, which can be computed without notable overhead in the Trial-based Heuristic Tree Search (THTS) framework (Keller and Helmert 2013). We show theoretically and empirically that all algorithms outperform the naïve base approach that ignores the potential of optimizing evaluation runs in hindsight, and that it pays off to take suboptimal base
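The evaluation idea mentioned in this excerpt, performing a sequence of runs and using their average as an approximation of the expected reward, can be sketched as follows. The simulated "solver run" is a hypothetical stand-in (ten steps, each paying reward 1 with probability 0.7), not the IPPC setup or any real planner, and the function names are assumptions made for this sketch.

```python
import random
import statistics


def simulate_run(rng):
    """Hypothetical stand-in for one run of an MDP solver on one instance.

    Ten steps, each yielding reward 1 with probability 0.7, so the true
    expected reward of a run is 7.0.
    """
    total = 0.0
    for _ in range(10):
        if rng.random() < 0.7:
            total += 1.0
    return total


def estimate_expected_reward(num_runs, seed=0):
    """Average the rewards of num_runs independent runs."""
    rng = random.Random(seed)
    rewards = [simulate_run(rng) for _ in range(num_runs)]
    return statistics.mean(rewards), rewards


mean_reward, rewards = estimate_expected_reward(5000)
```

With more runs, the average concentrates around the true expected reward of the stand-in process, which is why a sequence of runs is used rather than a single one.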

algorithm, artificial intelligence, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

An Improved Lower Bound for Bayesian Network Structure Learning

Fan, Xiannian (City University of New York) | Yuan, Changhe (City University of New York)

Several heuristic search algorithms such as A* and breadth-first branch and bound have been developed for learning Bayesian network structures that optimize a scoring function. These algorithms rely on a lower bound function called the k-cycle conflict heuristic to guide the search toward the most promising search spaces. The heuristic takes as input a partition of the random variables of a data set; the importance of the partition opens up opportunities for further research. This work introduces a new partition method based on information extracted from the potential optimal parent sets (POPS) of the variables. Empirical results show that the new partition can significantly improve the efficiency and scalability of heuristic search-based structure learning algorithms.

artificial intelligence, graph, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > New York (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Submodular Surrogates for Value of Information

How should we gather information to make effective decisions? A classical answer to this fundamental problem is given by the decision-theoretic value of information. Unfortunately, optimizing this objective is intractable, and myopic (greedy) approximations are known to perform poorly. In this paper, we introduce DiRECt, an efficient yet near-optimal algorithm for nonmyopically optimizing value of information. Crucially, DiRECt uses a novel surrogate objective that is (1) aligned with the value of information problem, (2) efficient to evaluate, and (3) adaptive submodular. This last property enables us to use an efficient greedy optimization while providing strong approximation guarantees. We demonstrate the utility of our approach on four diverse case studies: touch-based robotic localization, comparison-based preference learning, wildlife conservation management, and preference elicitation in behavioral economics. In the first application, we demonstrate DiRECt in closed loop on an actual robotic platform.
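To make the link between submodularity and greedy optimization concrete, here is a minimal sketch of greedy maximization of a coverage function, a standard monotone submodular objective. This is the simpler non-adaptive setting, not DiRECt and not its surrogate objective; for monotone submodular functions under a cardinality budget, the greedy rule is known to reach at least a (1 - 1/e) fraction of the optimum. The function name and the example sets are assumptions made for this sketch.

```python
def greedy_max_coverage(candidate_sets, budget):
    """Pick up to `budget` sets, each time taking the largest marginal gain.

    candidate_sets maps a (hypothetical) test name to the set of items it covers.
    """
    covered = set()
    chosen = []
    remaining = dict(candidate_sets)
    for _ in range(budget):
        best_name, best_gain = None, 0
        for name, items in remaining.items():
            gain = len(items - covered)  # marginal gain given what is covered
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # nothing left adds any new item
        chosen.append(best_name)
        covered |= remaining.pop(best_name)
    return chosen, covered


# Hypothetical example: four candidate tests, budget of two.
tests = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {1, 6}}
chosen, covered = greedy_max_coverage(tests, budget=2)
```

Marginal gains shrink as more items are covered, which is the diminishing-returns property that the greedy guarantee depends on.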

artificial intelligence, decision region, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Value of Information Based on Decision Robustness

Chen, Suming Jeremiah (University of California, Los Angeles) | Choi, Arthur (University of California, Los Angeles) | Darwiche, Adnan (University of California, Los Angeles)

There are many criteria for measuring the value of information (VOI), each based on a different principle that is usually suitable for specific applications. We propose a new criterion for measuring the value of information, which values information that leads to robust decisions (i.e., ones that are unlikely to change due to new information). We also introduce an algorithm for Naive Bayes networks that selects features with maximal VOI under the new criterion. We discuss the application of the new criterion to classification tasks, showing how it can be used to trade off the budget allotted for acquiring information against the classification accuracy. In particular, we show empirically that the new criterion can reduce the expended budget significantly while reducing the classification accuracy only slightly. We also show empirically that the new criterion leads to decisions that are much more robust than those based on traditional VOI criteria, such as information gain and classification loss. This makes the new criterion particularly suitable for certain decision-making applications.

artificial intelligence, machine learning, sdp

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > District of Columbia > Washington (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.78)

Optimal Cost Almost-Sure Reachability in POMDPs

Chatterjee, Krishnendu (IST Austria) | Chmelik, Martin (IST Austria) | Gupta, Raghav (IIT Bombay) | Kanodia, Ayush (IIT Bombay)

We consider partially observable Markov decision processes (POMDPs) with a set of target states, in which every transition is associated with an integer cost. The optimization objective we study asks to minimize the expected total cost until the target set is reached, while ensuring that the target set is reached almost surely (with probability 1). We show that for integer costs, approximating the optimal cost is undecidable. For positive costs, our results are as follows: (i) we establish matching lower and upper bounds for the optimal cost, and the bound is double exponential; (ii) we show that the problem of approximating the optimal cost is decidable and present approximation algorithms that build on the existing algorithms for POMDPs with finite-horizon objectives. While the worst-case running time of our algorithm is double exponential, we present efficient stopping criteria for the algorithm and show experimentally that it performs well in many examples of interest.

artificial intelligence, machine learning, pomdp

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Europe > Austria (0.05)
Asia > India (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Representing Aggregators in Relational Probabilistic Models

Buchman, David (University of British Columbia) | Poole, David (University of British Columbia)

Given a probabilistic model on a set of random variables, we consider the problem of how to add a new variable that depends on the other variables without changing the original distribution. In particular, we consider relational models (such as Markov logic networks (MLNs)), where we cannot directly define conditional probabilities. In relational models, there may be an unbounded number of parents in the grounding, and conditional distributions need to be defined in terms of aggregators. The question we ask is whether and when it is possible to represent conditional probabilities at all in various relational models. Some aggregators have been shown to be representable by MLNs by adding auxiliary variables; however, it was unknown whether they could be defined without auxiliary variables. For other aggregators, it was not known whether they can be represented by MLNs at all. We obtained surprisingly strong negative results on the capability of flexible undirected relational models such as MLNs to represent aggregators without affecting the original model's distribution. We provide a map of which aspects of the models, including the use of auxiliary variables and quantifiers, result in the ability to represent various aggregators. In addition, we provide proof techniques that can be used to facilitate future theoretical results on relational models, and demonstrate them on relational logistic regression (RLR).

artificial intelligence, bayesian inference, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Linear-Time Gibbs Sampling in Piecewise Graphical Models

Afshar, Hadi Mohasel (Australian National University, NICTA) | Sanner, Scott (NICTA, Australian National University) | Abbasnejad, Ehsan (Australian National University, NICTA)

Many real-world Bayesian inference problems such as preference learning or trader valuation modeling in financial markets naturally use piecewise likelihoods. Unfortunately, exact closed-form inference in the underlying Bayesian graphical models is intractable in the general case, and existing approximation techniques provide few guarantees on either approximation quality or efficiency. While (Markov Chain) Monte Carlo methods provide an attractive asymptotically unbiased approximation approach, rejection sampling and Metropolis-Hastings both prove inefficient in practice, and analytical derivation of Gibbs samplers requires exponential space and time in the amount of data. In this work, we show how to transform problematic piecewise likelihoods into equivalent mixture models and then provide a blocked Gibbs sampling approach for this transformed model that achieves an exponential-to-linear reduction in space and time compared to a conventional Gibbs sampler. This enables fast, asymptotically unbiased Bayesian inference in a new expressive class of piecewise graphical models and empirically requires orders of magnitude less time than rejection, Metropolis-Hastings, and conventional Gibbs sampling methods to achieve the same level of accuracy.
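For readers unfamiliar with Gibbs sampling, the sketch below shows a conventional textbook Gibbs sampler on a standard bivariate normal with correlation `rho`, where each coordinate is resampled from its exact conditional distribution. It is not the blocked sampler for piecewise models described in the abstract; the target distribution, function names, and parameter values are assumptions made for this sketch.

```python
import math
import random


def gibbs_bivariate_normal(rho, num_samples, burn_in=500, seed=0):
    """Gibbs sampling for a standard bivariate normal with correlation rho.

    Uses the exact conditionals x | y ~ N(rho * y, 1 - rho^2) and
    y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = random.Random(seed)
    cond_std = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    samples = []
    for i in range(burn_in + num_samples):
        x = rng.gauss(rho * y, cond_std)  # resample x given the current y
        y = rng.gauss(rho * x, cond_std)  # resample y given the new x
        if i >= burn_in:
            samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Empirical Pearson correlation of a list of (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
    var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
    return cov / math.sqrt(var_x * var_y)


samples = gibbs_bivariate_normal(rho=0.8, num_samples=20000)
estimated_rho = sample_correlation(samples)
```

The sampler only ever needs the one-dimensional conditionals, which is what makes Gibbs sampling attractive when those conditionals are cheap to derive and sample from.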

artificial intelligence, bayesian inference, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Banking & Finance (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)