AITopics

Kalman Filtering is a computational tool with widespread applications in robotics, financial and weather forecasting, environmental engineering and defense. Given observation and state transition models, the Kalman Filter (KF) recursively estimates the state variables of a dynamic system. However, the KF requires a cubic time matrix inversion operation at every timestep which prevents its application in domains with large numbers of state variables. We propose Relational Gaussian Models to represent and model dynamic systems with large numbers of variables efficiently. Furthermore, we devise an exact lifted Kalman Filtering algorithm which takes only linear time in the number of random variables at every timestep. We prove that our algorithm takes linear time in the number of state variables even when individual observations apply to each variable. To our knowledge, this is the first lifted (linear time) algorithm for filtering with continuous dynamic relational models.
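To make the cost mentioned in the abstract concrete, here is a minimal sketch of one predict/update cycle of a standard (non-lifted) Kalman filter for a single state variable. It is not the paper's lifted algorithm; the function names and the default parameters `a`, `q`, `h` and `r` are illustrative assumptions. In the scalar case the update needs only a division; with n state variables that division becomes the matrix inversion the abstract describes as cubic time.

```python
def kalman_step(mean, var, z, a=1.0, q=0.01, h=1.0, r=0.25):
    """One predict/update cycle of a scalar Kalman filter.

    Model (all parameters are illustrative assumptions):
        x_t = a * x_{t-1} + process noise with variance q
        z_t = h * x_t     + observation noise with variance r
    """
    # Predict: push the previous estimate through the transition model.
    pred_mean = a * mean
    pred_var = a * var * a + q

    # Update: the division below is the scalar counterpart of the
    # matrix inversion needed when there are many state variables.
    innovation_var = h * pred_var * h + r
    gain = pred_var * h / innovation_var

    new_mean = pred_mean + gain * (z - h * pred_mean)
    new_var = (1.0 - gain * h) * pred_var
    return new_mean, new_var


def kalman_filter(observations, mean=0.0, var=1.0):
    """Run the filter over a sequence; return the (mean, variance) history."""
    history = []
    for z in observations:
        mean, var = kalman_step(mean, var, z)
        history.append((mean, var))
    return history
```

Calling `kalman_filter([1.0, 1.0, 1.0])` returns one `(mean, variance)` pair per observation, with the estimate moving toward the observed value and the variance shrinking at each step.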

random variable, relational atom, timestep, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Probabilistic Goal Markov Decision Processes

Xu, Huan (National University of Singapore) | Mannor, Shie (Technion)

In contrast to the standard approach that studies the expected performance, we consider the policy that maximizes the probability of achieving a predetermined target performance, a criterion we term probabilistic goal Markov decision processes. We show that this problem is NP-hard, but can be solved using a pseudo-polynomial algorithm. We further consider a variant dubbed "chance-constraint Markov decision problems," that treats the probability of achieving target performance as a constraint instead of the maximizing objective. This variant is NP-hard, but can be solved in pseudo-polynomial time.

... studied in single-period optimization [Miller and Wagner, 1965; Prékopa, 1970]. However, little has been done in the context of sequential decision problem including MDPs. The standard approaches in risk-averse MDPs include maximization of expected utility function [Bertsekas, 1995], and optimization of a coherent risk measure [Riedel, 2004; Le Tallec, 2007]. Both approaches lead to formulations that cannot be solved in polynomial time, except for special cases including exponential utility function [Chung and Sobel, 1987], piecewise linear utility function with a single break down point [Liu and Koenig, 2005], and risk measures that can be reduced to robust MDPs satisfying the so-called ...
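One common way to see why a pseudo-polynomial solution is plausible is to augment the state with the reward accumulated so far. The sketch below does this for a finite-horizon MDP with non-negative integer rewards; it is an illustration under those assumptions, not the algorithm from the paper, and the `transitions` format is hypothetical. The table it fills grows with the numeric value of `target`, not with the number of bits needed to write it, which is what "pseudo-polynomial" refers to.

```python
from functools import lru_cache


def max_goal_probability(transitions, start, horizon, target):
    """Maximum probability that the accumulated reward reaches `target`.

    transitions[state][action] is a list of (probability, next_state, reward)
    triples. This sketch assumes non-negative integer rewards and at least
    one action in every reachable state.
    """
    @lru_cache(maxsize=None)
    def best(state, steps_left, remaining):
        # `remaining` is the reward still needed. It is clamped at 0, so
        # there are at most target + 1 values per (state, steps_left).
        if remaining == 0:
            return 1.0
        if steps_left == 0:
            return 0.0
        return max(
            sum(p * best(nxt, steps_left - 1, max(0, remaining - reward))
                for p, nxt, reward in outcomes)
            for outcomes in transitions[state].values()
        )

    return best(start, horizon, max(0, target))
```

For a one-state toy problem with a safe action (reward 1) and a risky action (reward 3 with probability 0.5, otherwise 0), a two-step horizon and a target of 2, choosing the safe action twice succeeds with probability 1, even though the risky action has the higher expected reward. The best choice depends on the reward still needed, which is why the accumulated reward has to be part of the state.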

mdp, probabilistic goal mdp, probability, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Israel (0.04)
North America > United States > New York (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Spaan, Matthijs T. J. (Institute for Systems and Robotics, Instituto Superior Técnico) | Oliehoek, Frans A. (Massachusetts Institute of Technology) | Amato, Christopher (Aptima, Inc)

Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion

Planning under uncertainty for multiagent systems can be formalized as a decentralized partially observable Markov decision process. We advance the state of the art for optimal solution of this model, building on the Multiagent A* heuristic search method. A key insight is that we can avoid the full expansion of a search node that generates a number of children that is doubly exponential in the node's depth. Instead, we incrementally expand the children only when a next child might have the highest heuristic value. We target a subsequent bottleneck by introducing a more memory-efficient representation for our heuristic functions. We prove that the resulting algorithm is correct, and experiments demonstrate a significant speedup over the state of the art, allowing for optimal solutions over longer horizons for many benchmark problems.

expansion, representation, zilberstein, (17 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Woburn (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Ramírez, Miquel (Universitat Pompeu Fabra) | Geffner, Hector (ICREA and Universitat Pompeu Fabra)

Goal Recognition over POMDPs: Inferring the Intention of a POMDP Agent

Plan recognition is the problem of inferring the goals and plans of an agent from partial observations of her behavior. Recently, it has been shown that the problem can be formulated and solved using planners, reducing plan recognition to plan generation. In this work, we extend this model-based approach to plan recognition to the POMDP setting, where actions are stochastic and states are partially observable. The task is to infer a probability distribution over the possible goals of an agent whose behavior results from a POMDP model. The POMDP model is shared between agent and observer except for the true goal of the agent, which is hidden from the observer. The observations are action sequences O that may contain gaps, as some or even most of the actions done by the agent may not be observed. We show that the posterior goal distribution P(G | O) can be computed from the value function V_G(b) over beliefs b generated by the POMDP planner for each possible goal G. Some extensions of the basic framework are discussed, and a number of experiments are reported.
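The last step of such an inference, turning per-goal likelihoods into a posterior over goals, is plain Bayes' rule. The sketch below assumes the likelihoods P(O | G) and the priors P(G) are already available; how the likelihoods are obtained from the value function is not covered here, and the function name and dictionary inputs are hypothetical.

```python
def goal_posterior(likelihoods, priors):
    """Return P(G | O) for each goal G via Bayes' rule.

    likelihoods[g] stands for P(O | g) and priors[g] for P(g). Both are
    assumed to be given; this function only normalizes their product.
    """
    unnormalized = {g: likelihoods[g] * priors[g] for g in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observations have zero probability under every goal")
    return {g: value / total for g, value in unnormalized.items()}
```

With two equally likely goals and likelihoods 0.2 and 0.6, the posterior puts three times as much mass on the second goal as on the first.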

agent, pomdp, recognition, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Kim, Dongho (Korea Advanced Institute of Science and Technology) | Lee, Jaesong (Korea Advanced Institute of Science and Technology) | Kim, Kee-Eung (Korea Advanced Institute of Science and Technology) | Poupart, Pascal (University of Waterloo)

Point-Based Value Iteration for Constrained POMDPs

Constrained partially observable Markov decision processes (CPOMDPs) extend standard POMDPs by allowing the specification of constraints on some aspects of the policy in addition to the optimality objective for the value function. CPOMDPs have many practical advantages over standard POMDPs since they naturally model problems involving limited resources or multiple objectives. In this paper, we show that the optimal policies in CPOMDPs can be randomized, and present exact and approximate dynamic programming methods for computing randomized optimal policies. While the exact method requires solving a minimax quadratically constrained program (QCP) in each dynamic programming update, the approximate method utilizes the point-based value update with a linear program (LP). We show that the randomized policies are significantly better than the deterministic ones. We also demonstrate that the approximate point-based method scales to large problems.

admissible cost, constraint, cpomdp, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Barry, Jennifer L. (Massachusetts Institute of Technology) | Kaelbling, Leslie Pack (Massachusetts Institute of Technology) | Lozano-Perez, Tomas (Massachusetts Institute of Technology)

DetH*: Approximate Hierarchical Solution of Large Markov Decision Processes

This paper presents an algorithm for finding approximately optimal policies in very large Markov decision processes by constructing a hierarchical model and then solving it approximately. It exploits factored representations to achieve compactness and efficiency and to discover connectivity properties of the domain. We provide a bound on the quality of the solutions and give asymptotic analysis of the runtimes; in addition we demonstrate performance on a collection of very large domains. Results show that the quality of resulting policies is very good and the total running times, for both creating and solving the hierarchy, are significantly less than for an optimal factored MDP solver.

hierarchy, primitive state, value function, (17 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences

Pietquin, Olivier (SUPELEC / UMI 2958) | Geist, Matthieu (SUPELEC) | Chandramohan, Senthilkumar (SUPELEC)

Designing dialog policies for voice-enabled interfaces is a tailoring job that is most often left to natural language processing experts. This job is generally redone for every new dialog task because cross-domain transfer is not possible. For this reason, machine learning methods for dialog policy optimization have been investigated during the last 15 years. In particular, reinforcement learning (RL) is now part of the state of the art in this domain. Standard RL methods require testing more or less random changes in the policy on users to assess them as improvements or degradations. This is called on-policy learning. However, it can result in system behaviors that are not acceptable to users. Learning algorithms should ideally infer an optimal strategy by observing interactions generated by a non-optimal but acceptable strategy, that is, by learning off-policy. In this contribution, a sample-efficient, online and off-policy reinforcement learning algorithm is proposed to learn an optimal policy from a few hundred dialogues generated with a very simple handcrafted policy.

algorithm, dialogue, optimal policy, (10 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(4 more...)

Genre: Research Report (0.68)

Industry: Education > Educational Setting > Online (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Collective Semantic Role Labeling for Tweets with Clustering

Liu, Xiaohua (Microsoft Research Asia, HIT) | Li, Kuan (Chongqing University) | Zhou, Ming (Microsoft Research Asia) | Xiong, Zhongyang (Chongqing University)

As tweets have become a comprehensive repository of fresh information, Semantic Role Labeling (SRL) for tweets has aroused great research interest because of its central role in a wide range of tweet-related studies such as fine-grained information extraction, sentiment analysis and summarization. However, the fact that a tweet is often too short and informal to provide sufficient information poses a major challenge. To tackle this challenge, we propose a new method to collectively label similar tweets. The underlying idea is to exploit similar tweets to make up for the lack of information in a tweet. Specifically, similar tweets are first grouped together by clustering. Then for each cluster a two-stage labeling is conducted: in the first stage, one labeler conducts SRL to get statistical information, such as the predicate/argument/role triples that occur frequently, from its highly confidently labeled results; in the second stage, another labeler performs SRL with such statistical information to refine the results. Experimental results on a human-annotated dataset show that our approach remarkably improves SRL by 3.1% F1.

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

South America > Chile (0.05)
Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Semantic Relationship Discovery with Wikipedia Structure

Bu, Fan (Tsinghua University) | Hao, Yu (Tsinghua University) | Zhu, Xiaoyan (Tsinghua University)

Thanks to the idea of social collaboration, Wikipedia has accumulated a vast amount of semi-structured knowledge in which the link structure reflects, to some extent, human cognition of semantic relationships. In this paper, we propose a novel method, RCRank, to jointly compute concept-concept relatedness and concept-category relatedness based on the assumption that information carried in concept-concept links and concept-category links can mutually reinforce each other. Unlike previous work, RCRank can not only find semantically related concepts but also interpret their relations by categories. Experimental results on concept recommendation and relation interpretation show that our method substantially outperforms classical methods.

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Germany (0.05)
Europe > France (0.04)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
(7 more...)

Genre: Research Report (0.34)

Industry: Government (0.31)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Visual Task Inference Using Hidden Markov Models

Abolhassani, Amin Haji (McGill University) | Clark, James J. (McGill University)

It has long been known that the visual task, such as reading, counting or searching, greatly influences eye movement patterns. Perhaps the best known demonstration of this is the celebrated study of Yarbus showing that different eye movement trajectories emerge depending on the visual task that the viewers are given. The objective of this paper is to develop an inverse Yarbus process whereby we can infer the visual task by observing the measurements of a viewer’s eye movements while executing the visual task. The method we propose is to use Hidden Markov Models (HMMs) to create a probabilistic framework to infer the viewer’s task from eye movements.
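One simple way to realize this kind of inference is to keep one discrete HMM per task, score the observed sequence under each with the forward algorithm, and normalize with task priors. The sketch below assumes that setup; it is not necessarily the model structure used in the paper, and the function names, the integer-coded observation symbols and the `task_models` layout are all hypothetical.

```python
def sequence_likelihood(observations, initial, transition, emission):
    """Forward algorithm: P(observations) under one discrete HMM.

    initial[i]       = P(first hidden state is i)
    transition[i][j] = P(next state is j | current state is i)
    emission[i][o]   = P(observing symbol o | hidden state is i)
    `observations` is assumed to be a non-empty list of symbol indices.
    """
    n = len(initial)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)


def infer_task(observations, task_models, task_priors):
    """Posterior over tasks, given one (initial, transition, emission) HMM per task."""
    scores = {
        task: task_priors[task] * sequence_likelihood(observations, *model)
        for task, model in task_models.items()
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("observations have zero probability under every task")
    return {task: score / total for task, score in scores.items()}
```

For long sequences the raw probabilities underflow, so a practical version would rescale `alpha` at each step or work in log space.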

probability, trajectory, visual task, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)