AITopics

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic finite state controllers, combining several advantages of gradient ascent (efficiency, search through restricted controller space) and policy iteration (less vulnerability to local optima).

belief state, controller, node, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.30)
Europe > Sweden > Stockholm > Stockholm (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(6 more...)

Industry: Government > Regional Government > North America Government > United States Government (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Chang, Yu-han, Ho, Tracey, Kaelbling, Leslie P.

All learning is Local: Multi-agent Learning in Global Reward Games

In large multiagent games, partial observability, coordination, and credit assignment persistently plague attempts to design good learning algorithms. We provide a simple and efficient algorithm that in part uses a linear system to model the world from a single agent's limited perspective, and takes advantage of Kalman filtering to allow an agent to construct a good training signal and learn an effective policy.

agent, optimal policy, reward signal, (15 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Parkes, David C., Singh, Satinder P.

An MDP-Based Approach to Online Mechanism Design

Online mechanism design (MD) considers the problem of providing incentives to implement desired system-wide outcomes in systems with self-interested agents that arrive and depart dynamically. Agents can choose to misrepresent their arrival and departure times, in addition to information about their value for different outcomes. We consider the problem of maximizing the total longterm value of the system despite the self-interest of agents. The online MD problem induces a Markov Decision Process (MDP), which when solved can be used to implement optimal policies in a truth-revealing Bayesian-Nash equilibrium.

agent, mechanism, vcg mechanism, (17 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Michigan (0.04)

Industry: Consumer Products & Services > Travel (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Theocharous, Georgios, Kaelbling, Leslie P.

Approximate Planning in POMDPs with Macro-Actions

Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and explore a new reinforcement learning algorithm over grid-points in belief space, which uses macro-actions and Monte Carlo updates of the Q-values. We apply the algorithm to a large scale robot navigation task and demonstrate that with temporal abstraction we can consider an even smaller part of the belief space, we can learn POMDP policies faster, and we can do information gathering more efficiently.

algorithm, belief space, pomdp, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Pineau, Joelle, Gordon, Geoffrey J., Thrun, Sebastian

Applying Metric-Trees to Belief-Point POMDPs

Recent developments in grid-based and point-based approximation algorithms for POMDPs have greatly improved the tractability of POMDP planning. These approaches operate on sets of belief points by individually learning a value function for each point. In reality, belief points exist in a highly-structured metric simplex, but current POMDP algorithms do not exploit this property. This paper presents a new metric-tree algorithm which can be used in the context of POMDP planning to sort belief points spatially, and then perform fast value function updates over groups of points. We present results showing that this approach can reduce computation in point-based POMDP algorithms for a wide range of problems.

algorithm, belief point, node, (15 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Wang, Xuerui, Hutchinson, Rebecca, Mitchell, Tom M.

Training fMRI Classifiers to Detect Cognitive States across Multiple Human Subjects

We consider learning to classify cognitive states of human subjects, based on their brain activity observed via functional Magnetic Resonance Imaging (fMRI). This problem is important because such classifiers constitute "virtual sensors" of hidden cognitive states, which may be useful in cognitive science research and clinical applications. In recent work, Mitchell, et al. [6,7,9] have demonstrated the feasibility of training such classifiers for individual human subjects (e.g., to distinguish whether the subject is reading an ambiguous or unambiguous sentence, or whether they are reading a noun or a verb). Here we extend that line of research, exploring how to train classifiers that can be applied across multiple human subjects, including subjects who were not involved in training the classifier. We describe the design of several machine learning approaches to training multiple-subject classifiers, and report experimental results demonstrating the success of these methods in learning cross-subject classifiers for two different fMRI data sets.

accuracy, brain, classifier, (15 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > New York (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.90)
Health & Medicine > Diagnostic Medicine > Imaging (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Hughes, Nicholas P., Tarassenko, Lionel, Roberts, Stephen J.

Markov Models for Automated ECG Interval Analysis

We examine the use of hidden Markov and hidden semi-Markov models for automatically segmenting an electrocardiogram waveform into its constituent waveform features. An undecimated wavelet transform is used to generate an overcomplete representation of the signal that is more appropriate for subsequent modelling. We show that the state durations implicit in a standard hidden Markov model are ill-suited to those of real ECG features, and we investigate the use of hidden semi-Markov models for improved state duration modelling.

ecg waveform, state duration, waveform, (13 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Hungary > Budapest > Budapest (0.04)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Classification with Hybrid Generative/Discriminative Models

Raina, Rajat, Shen, Yirong, McCallum, Andrew, Ng, Andrew Y.

Although discriminatively trained classifiers are usually more accurate when labeled training data is abundant, previous work has shown that when training data is limited, generative classifiers can outperform them. This paper describes a hybrid model in which a high-dimensional subset of the parameters are trained to maximize generative likelihood, and another, small, subset of parameters are discriminatively trained to maximize conditional likelihood. We give a sample complexity bound showing that in order to fit the discriminative parameters well, the number of training examples required depends only on the logarithm of the number of feature occurrences and feature set size. Experimental results show that hybrid models can provide lower test error and can produce better accuracy/coverage curves than either their purely generative or purely discriminative counterparts. We also discuss several advantages of hybrid models, and advocate further work in this area.

algorithm, classifier, logistic regression, (13 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Ji, Gang, Bilmes, Jeff A.

Necessary Intransitive Likelihood-Ratio Classifiers

In pattern classification tasks, errors are introduced because of differences between the true model and the one obtained via model estimation. Using likelihood-ratio based classification, it is possible to correct for this discrepancy by finding class-pair specific terms to adjust the likelihood ratio directly, and that can make class-pair preference relationships intransitive. In this work, we introduce new methodology that makes necessary corrections to the likelihood ratio, specifically those that are necessary to achieve perfect classification (but not perfect likelihood-ratio correction which can be overkill). The new corrections, while weaker than previously reported such adjustments, are analytically challenging since they involve discontinuous functions, therefore requiring several approximations. We test a number of these new schemes on an isolatedword speech recognition task as well as on the UCI machine learning data sets. Results show that by using the bias terms calculated in this new way, classification accuracy can substantially improve over both the baseline and over our previous results.

approximation, correction, likelihood ratio, (15 more...)

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.95)

Steck, Harald, Jaakkola, Tommi S.

Bias-Corrected Bootstrap and Model Uncertainty

The bootstrap has become a popular method for exploring model (structure) uncertainty. Our experiments with artificial and realworld data demonstrate that the graphs learned from bootstrap samples can be severely biased towards too complex graphical models. Accounting for this bias is hence essential, e.g., when exploring model uncertainty. We find that this bias is intimately tied to (well-known) spurious dependences induced by the bootstrap. The leading-order bias-correction equals one half of Akaike's penalty for model complexity. We demonstrate the effect of this simple bias-correction in our experiments. We also relate this bias to the bias of the plugin estimator for entropy, as well as to the difference between the expected test and training errors of a graphical model, which asymptotically equals Akaike's penalty (rather than one half).

bootstrap, experiment, network structure, (17 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)