Industry
Exploiting Contextual Independence In Probabilistic Inference
Bayesian belief networks have grown to prominence because they provide compact representations for many problems for which probabilistic inference is appropriate, and there are algorithms to exploit this compactness. The next step is to allow compact representations of the conditional probabilities of a variable given its parents. In this paper we present such a representation that exploits contextual independence in terms of parent contexts; which variables act as parents may depend on the value of other variables. The internal representation is in terms of contextual factors (confactors) that is simply a pair of a context and a table. The algorithm, contextual variable elimination, is based on the standard variable elimination algorithm that eliminates the non-query variables in turn, but when eliminating a variable, the tables that need to be multiplied can depend on the context. This algorithm reduces to standard variable elimination when there is no contextual independence structure to exploit. We show how this can be much more efficient than variable elimination when there is structure to exploit. We explain why this new method can exploit more structure than previous methods for structured belief network inference and an analogous algorithm that uses trees.
Monte Carlo Methods for Tempo Tracking and Rhythm Quantization
The on tin uous hidden v ariables denote the temp o. Ex-a t omputation of p osterior features su h as the MAP state is in tra table in this mo del lass, so w e in tro du e Mon te Carlo metho ds for in tegration and optimization. The metho ds an b e applied in b oth online and bat h s enarios su h as temp o tra king and trans ription and are th us p oten tially useful in a n um b er of m usi appli ations su h as adaptiv e automati a ompanimen t, s ore t yp esetting and m usi information retriev al. 1. Ho w ev er, when op erating on sampled audio data from p olyphoni a ousti al signals, extra tion of a s ore-lik e des ription is a v ery hallenging auditory s ene analysis task (V er o e, Gardner, & S heirer, 1998). In this pap er, w e fo us on a subproblem in m usi -ir, where w e assume that exa t timing information of notes is a v ailable, for example as a stream of MIDI 1 ev en ts from a digital k eyb oard. One example is automati s ore t yp esetting, 1. Musi al Instrumen ts Digital In terfa e. Ea h time a k ey is pressed, a MIDI k eyb oard generates a short message on taining pit h and k ey v elo it y . In on v en tional m usi notation, the onset time of ea h note is impli itly represen ted b y the um ulativ e sum of durations of previous notes. Durations are en o ded b y simple rational n um b ers (e.g., quarter note, eigh th note), onsequen tly all ev en ts in m usi are pla ed on a dis rete grid. This is due to the fa t that m usi ians in tro du e in ten tional (and unin ten tional) deviations from a me hani al pres ription. F or example timing of ev en ts an b e delib erately dela y ed or pushed. Moreo v er, the temp o an u tuate b y slo wing do wn or a elerating. In fa t, su h deviations are natural asp e ts of expressiv e p erforman e; in the absen e of these, m usi tends to sound rather dull and me hani al. On the other hand, if these deviations are not a oun ted for during trans ription, resulting s ores ha v e often v ery p o or qualit y . Robust and fast quan tization and temp o tra king is also an imp ortan t requiremen t for in tera tiv e p erforman e systems; appli ations that \listen" to a p erformer for generating an a ompanimen t or impro visation in real time (Raphael, 2001b; Thom, 2000). A t last, su h mo dels are also useful in m usi ology for systemati study and hara terization of expressiv e timing b y prin ipled analysis of existing p erforman e data. F rom a theoreti al p ersp e tiv e, sim ultaneous quan tization and temp o tra king is a \ hi k en-and-egg" problem: the quan tization dep ends up on the in tended temp o in terpre-tation and the temp o in terpretation dep ends up on the quan tization. Apparen tly, h uman listeners an resolv e this am biguit y (in most ases) without an y eort.
Translation of Pronominal Anaphora between English and Spanish: Discrepancies and Evaluation
This paper evaluates the different tasks carried out in the translation of pronominal anaphora in a machine translation (MT) system. The MT interlingua approach named AGIR (Anaphora Generation with an Interlingua Representation) improves upon other proposals presented to date because it is able to translate intersentential anaphors, detect co-reference chains, and translate Spanish zero pronouns into English---issues hardly considered by other systems. The paper presents the resolution and evaluation of these anaphora problems in AGIR with the use of different kinds of knowledge (lexical, morphological, syntactic, and semantic). The translation of English and Spanish anaphoric third-person personal pronouns (including Spanish zero pronouns) into the target language has been evaluated on unrestricted corpora. We have obtained a precision of 80.4% and 84.8% in the translation of Spanish and English pronouns, respectively. Although we have only studied the Spanish and English languages, our approach can be easily extended to other languages such as Portuguese, Italian, or Japanese.
Inferring 3D Articulated Models for Box Packaging Robot
Yang, Heran, Low, Tiffany, Cong, Matthew, Saxena, Ashutosh
Given a point cloud, we consider inferring kinematic models of 3D articulated objects such as boxes for the purpose of manipulating them. While previous work has shown how to extract a planar kinematic model (often represented as a linear chain), such planar models do not apply to 3D objects that are composed of segments often linked to the other segments in cyclic configurations. We present an approach for building a model that captures the relation between the input point cloud features and the object segment as well as the relation between the neighboring object segments. We use a conditional random field that allows us to model the dependencies between different segments of the object. We test our approach on inferring the kinematic structure from partial and noisy point cloud data for a wide variety of boxes including cake boxes, pizza boxes, and cardboard cartons of several sizes. The inferred structure enables our robot to successfully close these boxes by manipulating the flaps.
High-dimensional covariance estimation based on Gaussian graphical models
Zhou, Shuheng, Rutimann, Philipp, Xu, Min, Buhlmann, Peter
Undirected graphs are often used to describe high dimensional distributions. Under sparsity conditions, the graph can be estimated using $\ell_1$-penalization methods. We propose and study the following method. We combine a multiple regression approach with ideas of thresholding and refitting: first we infer a sparse undirected graphical model structure via thresholding of each among many $\ell_1$-norm penalized regression functions; we then estimate the covariance matrix and its inverse using the maximum likelihood estimator. We show that under suitable conditions, this approach yields consistent estimation in terms of graphical structure and fast convergence rates with respect to the operator and Frobenius norm for the covariance matrix and its inverse. We also derive an explicit bound for the Kullback Leibler divergence.
Gaussian Process Regression with a Student-t Likelihood
Jylänki, Pasi, Vanhatalo, Jarno, Vehtari, Aki
This paper considers the robust and efficient implementation of Gaussian process regression with a Student-t observation model. The challenge with the Student-t model is the analytically intractable inference which is why several approximative methods have been proposed. The expectation propagation (EP) has been found to be a very accurate method in many empirical studies but the convergence of the EP is known to be problematic with models containing non-log-concave site functions such as the Student-t distribution. In this paper we illustrate the situations where the standard EP fails to converge and review different modifications and alternative algorithms for improving the convergence. We demonstrate that convergence problems may occur during the type-II maximum a posteriori (MAP) estimation of the hyperparameters and show that the standard EP may not converge in the MAP values in some difficult cases. We present a robust implementation which relies primarily on parallel EP updates and utilizes a moment-matching-based double-loop algorithm with adaptively selected step size in difficult cases. The predictive performance of the EP is compared to the Laplace, variational Bayes, and Markov chain Monte Carlo approximations.
Machine Learning Markets
Prediction markets show considerable promise for developing flexible mechanisms for machine learning. Here, machine learning markets for multivariate systems are defined, and a utility-based framework is established for their analysis. This differs from the usual approach of defining static betting functions. It is shown that such markets can implement model combination methods used in machine learning, such as product of expert and mixture of expert approaches as equilibrium pricing models, by varying agent utility functions. They can also implement models composed of local potentials, and message passing methods. Prediction markets also allow for more flexible combinations, by combining multiple different utility functions. Conversely, the market mechanisms implement inference in the relevant probabilistic models. This means that market mechanism can be utilized for implementing parallelized model building and inference for probabilistic modelling.
Interactive Execution Monitoring of Agent Teams
Berry, P., Lee, T. J., Wilkins, D. E.
There is an increasing need for automated support for humans monitoring the activity of distributed teams of cooperating agents, both human and machine. We characterize the domain-independent challenges posed by this problem, and describe how properties of domains influence the challenges and their solutions. We will concentrate on dynamic, data-rich domains where humans are ultimately responsible for team behavior. Thus, the automated aid should interactively support effective and timely decision making by the human. We present a domain-independent categorization of the types of alerts a plan-based monitoring system might issue to a user, where each type generally requires different monitoring techniques. We describe a monitoring framework for integrating many domain-specific and task-specific monitoring techniques and then using the concept of value of an alert to avoid operator overload. We use this framework to describe an execution monitoring approach we have used to implement Execution Assistants (EAs) in two different dynamic, data-rich, real-world domains to assist a human in monitoring team behavior. One domain (Army small unit operations) has hundreds of mobile, geographically distributed agents, a combination of humans, robots, and vehicles. The other domain (teams of unmanned ground and air vehicles) has a handful of cooperating robots. Both domains involve unpredictable adversaries in the vicinity. Our approach customizes monitoring behavior for each specific task, plan, and situation, as well as for user preferences. Our EAs alert the human controller when reported events threaten plan execution or physically threaten team members. Alerts were generated in a timely manner without inundating the user with too many alerts (less than 10 percent of alerts are unwanted, as judged by domain experts).
Expert-Guided Subgroup Discovery: Methodology and Application
This paper presents an approach to expert-guided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The other important steps of the proposed subgroup discovery process are the detection of statistically significant properties of selected subgroups and subgroup visualization: statistically significant properties are used to enrich the descriptions of induced subgroups, while the visualization shows subgroup properties in the form of distributions of the numbers of examples in the subgroups. The approach is illustrated by the results obtained for a medical problem of early detection of patient risk groups.
Acquiring Word-Meaning Mappings for Natural Language Interfaces
This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. WOLFIE is part of an integrated system that learns to transform sentences into representations such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by WOLFIE are compared to those acquired by a similar system, with results favorable to WOLFIE. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.