AITopics

arXiv.org Artificial Intelligence Dec-26-2010

A Monte Carlo AIXI Approximation

Veness, Joel, Ng, Kee Siong, Hutter, Marcus, Uther, William, Silver, David

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

0909.0801

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
(2 more...)

arXiv.org Artificial Intelligence Nov-27-2010

Reinforcement Learning in Partially Observable Markov Decision Processes using Hybrid Probabilistic Logic Programs

Saad, Emad

We present a probabilistic logic programming framework for reinforcement learning that integrates reinforcement learning in POMDP environments with normal hybrid probabilistic logic programs with probabilistic answer set semantics, and that is capable of representing domain-specific knowledge. We formally prove the correctness of our approach. We show that the complexity of finding a policy for a reinforcement learning problem in our approach is NP-complete. In addition, we show that any reinforcement learning problem can be encoded as a classical logic program with answer set semantics. We also show that a reinforcement learning problem can be encoded as a SAT problem. We present a new high-level action description language that allows the factored representation of POMDPs. Moreover, we modify the original POMDP model so that it can distinguish between knowledge-producing actions and actions that change the environment.

logic & formal reasoning, machine learning, reinforcement, (18 more...)

arXiv.org Artificial Intelligence

1011.5951

Genre: Research Report (0.70)

Industry: Education > Focused Education > Special Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Sutton, Charles, McCallum, Andrew

An Introduction to Conditional Random Fields

arXiv.org Machine Learning Nov-17-2010

Often we wish to predict a large number of variables that depend on each other as well as on other observed variables. Structured prediction methods are essentially a combination of classification and graphical modeling, combining the ability of graphical models to compactly model multivariate data with the ability of classification methods to perform prediction using large sets of input features. This tutorial describes conditional random fields (CRFs), a popular probabilistic method for structured prediction. CRFs have seen wide application in natural language processing, computer vision, and bioinformatics. We describe methods for inference and parameter estimation for CRFs, including practical issues for implementing large-scale CRFs. We do not assume previous knowledge of graphical modeling, so this tutorial is intended to be useful to practitioners in a wide variety of fields.

algorithm, neural network, optimization problem, (23 more...)

arXiv.org Machine Learning

1011.4088

Country:

North America > United States > Massachusetts (0.28)
Asia > Middle East (0.28)
Europe > Germany (0.27)
(3 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(6 more...)

Hierarchical Multimodal Planning for Pervasive Interaction

Lin, Yong (University of Texas at Arlington) | Makedon, Fillia (University of Texas at Arlington)

Traditional dialogue management systems are tightly coupled with the sensing ability of a single computer. How to organize an interaction in pervasive environments to provide a friendly and integrated interface to users is an important issue. This requires a transition of human-computer interaction (HCI) from tight coupling to loose coupling. This paper proposes a hierarchical multimodal framework for pervasive interactions. Our system is designed to remind individuals with cognitive impairments of the activities of daily living. The system is composed of Markov decision processes for activity planning, and multimodal partially observable Markov decision processes for action planning and execution. Empirical results demonstrate that the hierarchical multimodal framework establishes a flexible mechanism for pervasive interaction systems.

interaction, pervasive interaction, pomdp, (17 more...)

Country:

North America > United States > Texas (0.04)
North America > Canada > British Columbia > East Kootenay Region > Fernie (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Modeling and Measuring Self-Regulated Learning in Teachable Agent Environments

Kinnebrew, John S. (Vanderbilt University) | Biswas, Gautam (Vanderbilt University) | Sulcer, William B. (Vanderbilt University)

Our learning-by-teaching environment has students take on the role and responsibilities of a teacher to a virtual student named Betty. The environment is structured so that successfully instructing their teachable agent requires the students to learn and understand science topics for themselves. This process is supported by adaptive scaffolding and feedback from the system. This feedback is instantiated through the interactions with the teachable agent and a mentor agent, named Mr. Davis. This paper provides an overview of two studies that were conducted with 5th-grade science students and a description of the analysis techniques that we have developed for interpreting students’ activities in this learning environment.

agent, betty, student, (13 more...)

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > New Jersey > Bergen County > Mahwah (0.04)
North America > United States > Florida > Pinellas County > Clearwater (0.04)

Genre:

Instructional Material (0.93)
Research Report > New Finding (0.46)

Industry:

Education > Educational Setting (0.93)
Education > Assessment & Standards > Student Performance (0.68)
Education > Curriculum > Subject-Specific Education (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Automata Modeling for Cognitive Interference in Users' Relevance Judgment

Zhang, Peng (The Robert Gordon University) | Song, Dawei (The Robert Gordon University) | Hou, Yuexian (Tianjin University) | Wang, Jun (Robert Gordon University) | Bruza, Peter (Queensland University of Technology)

Quantum theory has recently been employed to further advance the theory of information retrieval (IR). A challenging research topic is to investigate the so-called quantum-like interference in users' relevance judgment process, where users are asked to judge the relevance degree of each document with respect to a given query. In this process, users' relevance judgment for the current document is often interfered with by the judgment for previous documents, due to the interference on users' cognitive status. Research from cognitive science has demonstrated some initial evidence of quantum-like cognitive interference in human decision making, which underpins the user's relevance judgment process. This motivates us to model such cognitive interference in the relevance judgment process, which in our belief will lead to a better modeling and explanation of user behaviors in the relevance judgment process for IR and eventually lead to more user-centric IR models. In this paper, we propose to use a probabilistic automaton (PA) and a quantum finite automaton (QFA), which are suitable to represent the transition of user judgment states, to dynamically model the cognitive interference when the user is judging a list of documents.

artificial intelligence, interference, machine learning, (16 more...)

Country:

Oceania > Australia > Queensland (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Scalable POMDPs for Diagnosis and Planning in Intelligent Tutoring Systems

Folsom-Kovarik, Jeremiah T. (University of Central Florida) | Sukthankar, Gita (University of Central Florida) | Schatz, Sae (University of Central Florida) | Nicholson, Denise (University of Central Florida)

A promising application area for proactive assistant agents is automated tutoring and training. Intelligent tutoring systems (ITSs) assist tutors and tutees by automating diagnosis and adaptive tutoring. These tasks are well modeled by a partially observable Markov decision process (POMDP) since it accounts for the uncertainty inherent in diagnosis. However, an important aspect of making POMDP solvers feasible for real-world problems is selecting appropriate representations for states, actions, and observations. This paper studies two scalable POMDP state and observation representations. State queues allow POMDPs to temporarily ignore less-relevant states. Observation chains represent information in independent dimensions using sequences of observations to reduce the size of the observation set. Preliminary experiments with simulated tutees suggest the experimental representations perform as well as lossless POMDPs, and can model much larger problems.
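As a minimal illustration of how a POMDP accounts for uncertainty in diagnosis, the sketch below performs a single Bayesian belief update over a hidden tutee state. The two-state model (`knows`, `confused`), the `quiz` action, the observation names, and all probabilities are hypothetical values chosen for illustration; they are not taken from the paper, and the state queue and observation chain representations it studies are not implemented here.

```python
def belief_update(belief, action, observation, T, O):
    """One POMDP belief update.

    b'(s') is proportional to O[action][s'][observation] * sum_s T[action][s][s'] * b(s).
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: probability of reaching s_next under the action.
        predicted = sum(T[action][s][s_next] * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s_next] = O[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical model: quizzing does not change what the tutee knows.
T = {"quiz": {"knows": {"knows": 1.0, "confused": 0.0},
              "confused": {"knows": 0.0, "confused": 1.0}}}
# Hypothetical observation model: a tutee who knows the material can still slip.
O = {"quiz": {"knows": {"correct": 0.9, "wrong": 0.1},
              "confused": {"correct": 0.3, "wrong": 0.7}}}

prior = {"knows": 0.5, "confused": 0.5}
posterior = belief_update(prior, "quiz", "wrong", T, O)
```

Under these assumed numbers, a wrong answer shifts the belief toward `confused` without ruling out `knows`, which is the kind of graded diagnosis the abstract refers to.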

artificial intelligence, machine learning, pomdp, (18 more...)

Country:

North America > United States > Florida > Orange County > Orlando (0.05)
North America > United States > Florida > Hillsborough County > University (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.89)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Lison, Pierre (German Research Centre for Artificial Intelligence (DFKI GmbH)) | Kruijff, Geert-Jan M. (German Research Centre for Artificial Intelligence (DFKI))

Policy Activation for Open-Ended Dialogue Management

An important difficulty in developing spoken dialogue systems for robots is the open-ended nature of most interactions. Robotic agents must typically operate in complex, continuously changing environments which are difficult to model and do not provide any clear, predefined goal. Directly capturing this complexity in a single, large dialogue policy is thus inadequate. This paper presents a new approach which tackles the complexity of open-ended interactions by breaking it into a set of small, independent policies, which can be activated and deactivated at runtime by a dedicated mechanism. The approach is currently being implemented in a spoken dialogue system for autonomous robots.

artificial intelligence, machine learning, natural language, (16 more...)

Country: Europe > Germany > Saarland > Saarbrücken (0.05)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.55)

Matignon, Laetitia (University of Caen Basse Normandie) | Karami, Abir Beatrice (University of Caen Basse Normandie) | Mouaddib, Abdel-Illah (University of Caen Basse Normandie)

A Model for Verbal and Non-Verbal Human-Robot Collaboration

Our motivation is to build a system for an autonomous robot companion that collaborates with a human partner to achieve a common mission. The objective of the robot is to infer the human's preferences over the tasks of the mission, so as to collaborate by performing the tasks the human does not favor. Inspired by recent research on the recognition of human intention, we propose a unified model that allows the robot to switch accurately between verbal and non-verbal interactions. Our system unifies an epistemic partially observable Markov decision process (POMDP), which is a human-robot spoken dialog system aimed at disambiguating the human's preferences, and an intuitive human-robot collaboration that infers the human's intention from the observed human actions. The beliefs over the human's preferences computed during the dialog are then reinforced in the course of task execution by the intuitive interaction. Our unified model helps the robot infer the human's preferences and decide which tasks to perform to effectively satisfy these preferences. The robot is also able to adjust its plan rapidly in case of sudden changes in the human's preferences and to switch between both kinds of interaction. Experimental results on a scenario inspired by RoboCup@Home outline various specific behaviors of the robot during the collaborative mission.

artificial intelligence, machine learning, robot, (15 more...)

Country: Europe > France (0.04)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)