Dietterich, Thomas G.



Learning Scripts as Hidden Markov Models

AAAI Conferences

Scripts have been proposed to model the stereotypical event sequences found in narratives. They can be applied to make a variety of inferences including filling gaps in the narratives and resolving ambiguous references. This paper proposes the first formal framework for scripts based on Hidden Markov Models (HMMs). Our framework supports robust inference and learning algorithms, which are lacking in previous clustering models. We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural datasets. The results show that our algorithm is superior to several informed baselines for predicting missing events in partial observation sequences.
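
The abstract centers on inferring missing events in partially observed sequences under an HMM. As a rough illustration only (not the paper's structure-and-parameter learning algorithm), the sketch below applies the standard forward-backward recursions to a fitted HMM to compute a posterior over the event at a missing position; the toy parameters are hypothetical placeholders.

```python
import numpy as np

# Hypothetical toy HMM over 2 hidden states and 3 event types.
pi = np.array([0.6, 0.4])                  # initial state distribution
A  = np.array([[0.7, 0.3],                 # state transition matrix
               [0.2, 0.8]])
B  = np.array([[0.5, 0.4, 0.1],            # emission probs: P(event | state)
               [0.1, 0.3, 0.6]])

def posterior_over_missing(obs, missing_t):
    """obs: list of event ids, with the entry at missing_t unknown (None)."""
    T, S = len(obs), A.shape[0]
    # Emission likelihoods; a missing observation contributes a factor of 1.
    like = np.ones((T, S))
    for t, o in enumerate(obs):
        if t != missing_t and o is not None:
            like[t] = B[:, o]
    # Forward pass.
    alpha = np.zeros((T, S))
    alpha[0] = pi * like[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * like[t]
    # Backward pass.
    beta = np.zeros((T, S))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (like[t + 1] * beta[t + 1])
    # State posterior at the missing slot, then marginalize over states.
    gamma = alpha[missing_t] * beta[missing_t]
    gamma /= gamma.sum()
    return gamma @ B                        # P(event at missing_t | observed events)

print(posterior_over_missing([0, None, 2, 1], missing_t=1))
```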


Reconstructing Velocities of Migrating Birds from Weather Radar – A Case Study in Computational Sustainability

AI Magazine

Bird migration occurs at the largest of global scales, but monitoring such movements can be challenging. In the US there is an operational network of weather radars providing freely accessible data for monitoring meteorological phenomena in the atmosphere. Individual radars are sensitive enough to detect birds, and can provide insight into migratory behaviors of birds at scales that are not possible using other sensors. Archived data from the WSR-88D network of US weather radars hold valuable and detailed information about the continent-scale migratory movements of birds over the last 20 years. However, significant technical challenges must be overcome to understand this information and harness its potential for science and conservation. We describe recent work on an AI system to quantify bird migration using radar data, which is part of the larger BirdCast project to model and forecast bird migration at large scales using radar, weather, and citizen science data.


Guiding Scientific Discovery with Explanations Using DEMUD

AAAI Conferences

In the era of large scientific data sets, there is an urgent need for methods to automatically prioritize data for review. At the same time, for any automated method to be adopted by scientists, it must make decisions that they can understand and trust. In this paper, we propose Discovery through Eigenbasis Modeling of Uninteresting Data (DEMUD), which uses principal components modeling and reconstruction error to prioritize data. DEMUD’s major advance is to offer domain-specific explanations for its prioritizations. We evaluated DEMUD’s ability to quickly identify diverse items of interest and the value of the explanations it provides. We found that DEMUD performs as well or better than existing class discovery methods and provides, uniquely, the first explanations for why those items are of interest. Further, in collaborations with planetary scientists, we found that DEMUD (1) quickly identifies very rare items of scientific value, (2) maintains high diversity in its selections, and (3) provides explanations that greatly improve human classification accuracy.
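
As a rough illustration of the eigenbasis-and-reconstruction-error idea (a simplification, not the published DEMUD algorithm, which uses incremental SVD updates and careful initialization), the sketch below repeatedly selects the item with the highest reconstruction error under an SVD model of the items chosen so far and returns the residual vector as a crude "explanation". All data and parameter values are hypothetical.

```python
import numpy as np

def demud_like_selection(X, k=3, n_select=5):
    """Iteratively pick the item with the largest reconstruction error under
    an SVD model of everything selected so far; the residual is the
    per-feature explanation of why the item looks novel."""
    X = np.asarray(X, dtype=float)
    selected, model = [], np.zeros((0, X.shape[1]))
    remaining = list(range(X.shape[0]))
    for _ in range(n_select):
        if model.shape[0] < 2:
            # Too little selected data for an eigenbasis: use mean deviation.
            mu = model.mean(axis=0) if model.shape[0] else X.mean(axis=0)
            errors = {i: X[i] - mu for i in remaining}
        else:
            mu = model.mean(axis=0)
            _, _, Vt = np.linalg.svd(model - mu, full_matrices=False)
            U = Vt[:k].T                      # top-k eigenbasis of selected data
            errors = {i: (X[i] - mu) - U @ (U.T @ (X[i] - mu)) for i in remaining}
        best = max(remaining, key=lambda i: np.linalg.norm(errors[i]))
        selected.append((best, errors[best]))  # item index + residual explanation
        model = np.vstack([model, X[best]])
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
for idx, residual in demud_like_selection(rng.normal(size=(100, 10))):
    print(idx, np.round(residual, 2))
```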


Incorporating Boosted Regression Trees into Ecological Latent Variable Models

AAAI Conferences

Important ecological phenomena are often observed indirectly. Consequently, probabilistic latent variable models provide an important tool, because they can include explicit models of the ecological phenomenon of interest and the process by which it is observed. However, existing latent variable methods rely on hand-formulated parametric models, which are expensive to design and require extensive preprocessing of the data. Nonparametric methods (such as regression trees) automate these decisions and produce highly accurate models. However, existing tree methods learn direct mappings from inputs to outputs — they cannot be applied to latent variable models. This paper describes a methodology for integrating nonparametric tree methods into probabilistic latent variable models by extending functional gradient boosting. The approach is presented in the context of occupancy-detection (OD) modeling, where the goal is to model the distribution of a species from imperfect detections. Experiments on 12 real and 3 synthetic bird species compare standard and tree-boosted OD models (latent variable models) with standard and tree-boosted logistic regression models (without latent structure). All methods perform similarly when predicting the observed variables, but the OD models learn better representations of the latent process. Most importantly, tree-boosted OD models learn the best latent representations when nonlinearities and interactions are present.
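
To make the functional-gradient idea concrete, here is a heavily simplified sketch (assumptions: an equal number of visits per site, synthetic covariates, and off-the-shelf sklearn regression trees; not the authors' implementation) that boosts an occupancy-detection marginal likelihood by fitting small trees to its gradients with respect to the occupancy and detection scores.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Sites have occupancy covariates X (n, dx); each site gets T visits with
# detection covariates W (n, T, dw) and binary detection histories Y (n, T).
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def fit_tree_boosted_od(X, W, Y, rounds=50, lr=0.1, depth=2):
    n, T = Y.shape
    F = np.zeros(n)                # occupancy score per site
    G = np.zeros((n, T))           # detection score per visit
    trees_F, trees_G = [], []
    for _ in range(rounds):
        psi, p = sigmoid(F), sigmoid(G)
        det_lik = np.prod(np.where(Y == 1, p, 1 - p), axis=1)  # P(history | occupied)
        all_zero = (Y.sum(axis=1) == 0).astype(float)
        lik = np.maximum(psi * det_lik + (1 - psi) * all_zero, 1e-12)
        # Functional gradients of the marginal log-likelihood.
        grad_F = psi * (1 - psi) * (det_lik - all_zero) / lik
        grad_G = (psi * det_lik / lik)[:, None] * (Y - p)
        # Fit small regression trees to the gradients and take a boosting step.
        tF = DecisionTreeRegressor(max_depth=depth).fit(X, grad_F)
        tG = DecisionTreeRegressor(max_depth=depth).fit(
            W.reshape(n * T, -1), grad_G.reshape(-1))
        F += lr * tF.predict(X)
        G += lr * tG.predict(W.reshape(n * T, -1)).reshape(n, T)
        trees_F.append(tF); trees_G.append(tG)
    return trees_F, trees_G

# Hypothetical synthetic data: occupancy driven by X[:, 0], detection by W[:, :, 0].
rng = np.random.default_rng(1)
X, W = rng.normal(size=(500, 3)), rng.normal(size=(500, 4, 2))
occ = rng.random(500) < sigmoid(X[:, 0])
det = rng.random((500, 4)) < sigmoid(W[:, :, 0])
fit_tree_boosted_od(X, W, (occ[:, None] & det).astype(int))
```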


Reinforcement Learning Via Practice and Critique Advice

AAAI Conferences

We consider the problem of incorporating end-user advice into reinforcement learning (RL). In our setting, the learner alternates between practicing, where learning is based on actual world experience, and end-user critique sessions where advice is gathered. During each critique session the end-user is allowed to analyze a trajectory of the current policy and then label an arbitrary subset of the available actions as good or bad. Our main contribution is an approach for integrating all of the information gathered during practice and critiques in order to effectively optimize a parametric policy. The approach optimizes a loss function that linearly combines losses measured against the world experience and the critique data. We evaluate our approach using a prototype system for teaching tactical battle behavior in a real-time strategy game engine. Results are given for a significant evaluation involving ten end-users showing the promise of this approach and also highlighting challenges involved in inserting end-users into the RL loop.
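
As a hedged sketch of the kind of objective described (not the paper's exact loss, features, or policy class; every name below is hypothetical), the code combines a policy-gradient-style practice loss over experienced trajectories with a log-loss over end-user critiques of individual actions, weighted by a trade-off parameter lam.

```python
import numpy as np

def softmax_policy(theta, state_feats):
    """state_feats: (n_actions, d) per-action features; returns action probabilities."""
    logits = state_feats @ theta
    logits -= logits.max()
    e = np.exp(logits)
    return e / e.sum()

def combined_loss(theta, practice, critiques, lam=1.0):
    # Practice term: REINFORCE-style surrogate, -sum_t return_t * log pi(a_t | s_t),
    # over (state_feats, action, return) tuples gathered during practice.
    practice_loss = 0.0
    for state_feats, action, ret in practice:
        probs = softmax_policy(theta, state_feats)
        practice_loss -= ret * np.log(probs[action] + 1e-12)
    # Critique term: log-loss pushing labeled-good actions up and labeled-bad down,
    # over (state_feats, action, is_good) tuples from critique sessions.
    critique_loss = 0.0
    for state_feats, action, good in critiques:
        p = softmax_policy(theta, state_feats)[action]
        critique_loss -= np.log(p + 1e-12) if good else np.log(1 - p + 1e-12)
    return practice_loss + lam * critique_loss
```

The weight lam controls how strongly end-user critiques pull on the policy relative to the agent's own practice experience; the combined loss can then be minimized with any standard gradient-based or derivative-free optimizer.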


Machine-Learning Research

AI Magazine

Machine-learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are (1) the improvement of classification accuracy by learning ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.
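
As a minimal illustration of direction (1), the sketch below (a toy example, not any specific method surveyed in the article) builds a bagged ensemble of decision trees and combines their predictions by majority vote.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagged_trees(X, y, n_estimators=25, seed=0):
    """Train each tree on a bootstrap resample of (X, y)."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample with replacement
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def ensemble_predict(models, X):
    """Majority vote per sample; assumes non-negative integer class labels."""
    votes = np.stack([m.predict(X) for m in models]).astype(int)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```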