Monte Carlo Methods for Tempo Tracking and Rhythm Quantization
The continuous hidden variables denote the tempo. Exact computation of posterior features such as the MAP state is intractable in this model class, so we introduce Monte Carlo methods for integration and optimization. The methods can be applied in both online and batch scenarios such as tempo tracking and transcription and are thus potentially useful in a number of music applications such as adaptive automatic accompaniment, score typesetting and music information retrieval. However, when operating on sampled audio data from polyphonic acoustical signals, extraction of a score-like description is a very challenging auditory scene analysis task (Vercoe, Gardner, & Scheirer, 1998). In this paper, we focus on a subproblem in music-IR, where we assume that exact timing information of notes is available, for example as a stream of MIDI (Musical Instruments Digital Interface) events from a digital keyboard. Each time a key is pressed, a MIDI keyboard generates a short message containing pitch and key velocity. One example is automatic score typesetting. In conventional music notation, the onset time of each note is implicitly represented by the cumulative sum of durations of previous notes. Durations are encoded by simple rational numbers (e.g., quarter note, eighth note), consequently all events in music are placed on a discrete grid. This is due to the fact that musicians introduce intentional (and unintentional) deviations from a mechanical prescription. For example timing of events can be deliberately delayed or pushed. Moreover, the tempo can fluctuate by slowing down or accelerating. In fact, such deviations are natural aspects of expressive performance; in the absence of these, music tends to sound rather dull and mechanical. On the other hand, if these deviations are not accounted for during transcription, resulting scores have often very poor quality.
Robust and fast quantization and tempo tracking is also an important requirement for interactive performance systems; applications that "listen" to a performer for generating an accompaniment or improvisation in real time (Raphael, 2001b; Thom, 2000). At last, such models are also useful in musicology for systematic study and characterization of expressive timing by principled analysis of existing performance data. From a theoretical perspective, simultaneous quantization and tempo tracking is a "chicken-and-egg" problem: the quantization depends upon the intended tempo interpretation and the tempo interpretation depends upon the quantization. Apparently, human listeners can resolve this ambiguity (in most cases) without any effort.
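The grid representation and the dependence of quantization on tempo can be illustrated with a small sketch. This is not the Monte Carlo method of the paper: it is a naive nearest-grid-point rounding under a fixed, assumed beat period. The function names, the `subdivisions` parameter and the example timings are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import accumulate


def score_onsets(durations):
    """Onset of each note (in beats) as the cumulative sum of earlier durations."""
    return list(accumulate(durations, initial=Fraction(0)))[:-1]


def quantize(onsets_sec, beat_period, subdivisions=4):
    """Snap performed onsets (seconds) to the nearest grid point.

    Assumes a constant tempo given by `beat_period` (seconds per beat);
    this assumption is exactly what a real performance violates.
    """
    grid = beat_period / subdivisions
    return [Fraction(round(t / grid), subdivisions) for t in onsets_sec]


# Notated rhythm in beats: quarter, eighth, eighth, half.
durations = [Fraction(1), Fraction(1, 2), Fraction(1, 2), Fraction(2)]
notated = score_onsets(durations)

# A slightly uneven performance of the same rhythm, in seconds (made-up values).
performed = [0.02, 0.49, 0.77, 1.01]

right_tempo = quantize(performed, beat_period=0.5)
wrong_tempo = quantize(performed, beat_period=0.25)
```

With the beat period set to 0.5 s the rounding recovers the notated onsets, while the same performance read at 0.25 s per beat yields a different rhythm. The quantization is only as good as the tempo estimate it is given, which is the circularity described above.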
Translation of Pronominal Anaphora between English and Spanish: Discrepancies and Evaluation
This paper evaluates the different tasks carried out in the translation of pronominal anaphora in a machine translation (MT) system. The MT interlingua approach named AGIR (Anaphora Generation with an Interlingua Representation) improves upon other proposals presented to date because it is able to translate intersentential anaphors, detect co-reference chains, and translate Spanish zero pronouns into English---issues hardly considered by other systems. The paper presents the resolution and evaluation of these anaphora problems in AGIR with the use of different kinds of knowledge (lexical, morphological, syntactic, and semantic). The translation of English and Spanish anaphoric third-person personal pronouns (including Spanish zero pronouns) into the target language has been evaluated on unrestricted corpora. We have obtained a precision of 80.4% and 84.8% in the translation of Spanish and English pronouns, respectively. Although we have only studied the Spanish and English languages, our approach can be easily extended to other languages such as Portuguese, Italian, or Japanese.
Inferring 3D Articulated Models for Box Packaging Robot
Yang, Heran, Low, Tiffany, Cong, Matthew, Saxena, Ashutosh
Given a point cloud, we consider inferring kinematic models of 3D articulated objects such as boxes for the purpose of manipulating them. While previous work has shown how to extract a planar kinematic model (often represented as a linear chain), such planar models do not apply to 3D objects that are composed of segments often linked to the other segments in cyclic configurations. We present an approach for building a model that captures the relation between the input point cloud features and the object segment as well as the relation between the neighboring object segments. We use a conditional random field that allows us to model the dependencies between different segments of the object. We test our approach on inferring the kinematic structure from partial and noisy point cloud data for a wide variety of boxes including cake boxes, pizza boxes, and cardboard cartons of several sizes. The inferred structure enables our robot to successfully close these boxes by manipulating the flaps.
Towards Adjustable Autonomy for the Real World
Pynadath, D. V., Scerri, P., Tambe, M.
Adjustable autonomy refers to entities dynamically varying their own autonomy, transferring decision-making control to other entities (typically agents transferring control to human users) in key situations. Determining whether and when such transfers-of-control should occur is arguably the fundamental research problem in adjustable autonomy. Previous work has investigated various approaches to addressing this problem but has often focused on individual agent-human interactions. Unfortunately, domains requiring collaboration between teams of agents and humans reveal two key shortcomings of these previous approaches. First, these approaches use rigid one-shot transfers of control that can result in unacceptable coordination failures in multiagent settings. Second, they ignore costs (e.g., in terms of time delays or effects on actions) to an agent's team due to such transfers-of-control. To remedy these problems, this article presents a novel approach to adjustable autonomy, based on the notion of a transfer-of-control strategy. A transfer-of-control strategy consists of a conditional sequence of two types of actions: (i) actions to transfer decision-making control (e.g., from an agent to a user or vice versa) and (ii) actions to change an agent's pre-specified coordination constraints with team members, aimed at minimizing miscoordination costs. The goal is for high-quality individual decisions to be made with minimal disruption to the coordination of the team. We present a mathematical model of transfer-of-control strategies. The model guides and informs the operationalization of the strategies using Markov Decision Processes, which select an optimal strategy, given an uncertain environment and costs to the individuals and teams. The approach has been carefully evaluated, including via its use in a real-world, deployed multi-agent system that assists a research group in its daily activities.
Natural Evolution Strategies
Wierstra, Daan, Schaul, Tom, Glasmachers, Tobias, Sun, Yi, Schmidhuber, Jürgen
This paper presents Natural Evolution Strategies (NES), a recent family of algorithms that constitute a more principled approach to black-box optimization than established evolutionary algorithms. NES maintains a parameterized distribution on the set of solution candidates, and the natural gradient is used to update the distribution's parameters in the direction of higher expected fitness. We introduce a collection of techniques that address issues of convergence, robustness, sample complexity, computational complexity and sensitivity to hyperparameters. This paper explores a number of implementations of the NES family, ranging from general-purpose multi-variate normal distributions to heavy-tailed and separable distributions tailored towards global optimization and search in high dimensional spaces, respectively. Experimental results show best published performance on various standard benchmarks, as well as competitive performance on others.
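The update described above can be sketched for the simplest case, a Gaussian search distribution with a diagonal (separable) covariance. This is a minimal sketch under stated assumptions, not the authors' reference implementation: the function name `snes`, the population size, the iteration count and the learning-rate settings are illustrative choices, and the rank-based fitness shaping is one common option.

```python
import math
import random


def snes(fitness, mu, sigma, iterations=200, popsize=20, seed=0):
    """Maximize `fitness` with a separable-Gaussian natural-gradient search."""
    rng = random.Random(seed)
    d = len(mu)
    mu, sigma = list(mu), list(sigma)
    eta_mu = 1.0                                        # assumed learning rate
    eta_sigma = (3 + math.log(d)) / (5 * math.sqrt(d))  # assumed learning rate

    # Rank-based utilities: the best sample gets the largest weight,
    # and the weights sum to zero.
    raw = [max(0.0, math.log(popsize / 2 + 1) - math.log(k))
           for k in range(1, popsize + 1)]
    total = sum(raw)
    utilities = [r / total - 1.0 / popsize for r in raw]

    for _ in range(iterations):
        # Sample standard-normal noise; candidates are mu + sigma * noise.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(popsize)]
        ranked = sorted(
            noise,
            key=lambda s: fitness([m + sg * si for m, sg, si in zip(mu, sigma, s)]),
            reverse=True,
        )
        for i in range(d):
            grad_mu = sum(u * s[i] for u, s in zip(utilities, ranked))
            grad_sigma = sum(u * (s[i] ** 2 - 1) for u, s in zip(utilities, ranked))
            mu[i] += eta_mu * sigma[i] * grad_mu
            sigma[i] *= math.exp(0.5 * eta_sigma * grad_sigma)
    return mu, sigma


def sphere(x):
    """Toy objective with its maximum at the origin."""
    return -sum(v * v for v in x)


best_mu, best_sigma = snes(sphere, mu=[3.0, -2.0], sigma=[1.0, 1.0])
```

Only the ranking of the samples enters the update, so the search is invariant to monotone transformations of the fitness function. The multiplicative update keeps every standard deviation positive.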
High-dimensional covariance estimation based on Gaussian graphical models
Zhou, Shuheng, Rütimann, Philipp, Xu, Min, Bühlmann, Peter
Undirected graphs are often used to describe high-dimensional distributions. Under sparsity conditions, the graph can be estimated using $\ell_1$-penalization methods. We propose and study the following method. We combine a multiple regression approach with ideas of thresholding and refitting: first we infer a sparse undirected graphical model structure via thresholding of each among many $\ell_1$-norm penalized regression functions; we then estimate the covariance matrix and its inverse using the maximum likelihood estimator. We show that under suitable conditions, this approach yields consistent estimation in terms of graphical structure and fast convergence rates with respect to the operator and Frobenius norm for the covariance matrix and its inverse. We also derive an explicit bound for the Kullback-Leibler divergence.
Gaussian Process Regression with a Student-t Likelihood
Jylänki, Pasi, Vanhatalo, Jarno, Vehtari, Aki
This paper considers the robust and efficient implementation of Gaussian process regression with a Student-t observation model. The challenge with the Student-t model is the analytically intractable inference, which is why several approximative methods have been proposed. Expectation propagation (EP) has been found to be a very accurate method in many empirical studies, but the convergence of EP is known to be problematic with models containing non-log-concave site functions such as the Student-t distribution. In this paper we illustrate the situations where standard EP fails to converge and review different modifications and alternative algorithms for improving the convergence. We demonstrate that convergence problems may occur during the type-II maximum a posteriori (MAP) estimation of the hyperparameters and show that standard EP may not converge at the MAP values in some difficult cases. We present a robust implementation which relies primarily on parallel EP updates and utilizes a moment-matching-based double-loop algorithm with adaptively selected step size in difficult cases. The predictive performance of EP is compared to the Laplace, variational Bayes, and Markov chain Monte Carlo approximations.
Machine Learning Markets
Prediction markets show considerable promise for developing flexible mechanisms for machine learning. Here, machine learning markets for multivariate systems are defined, and a utility-based framework is established for their analysis. This differs from the usual approach of defining static betting functions. It is shown that such markets can implement model combination methods used in machine learning, such as product-of-experts and mixture-of-experts approaches as equilibrium pricing models, by varying agent utility functions. They can also implement models composed of local potentials, and message passing methods. Prediction markets also allow for more flexible combinations, by combining multiple different utility functions. Conversely, the market mechanisms implement inference in the relevant probabilistic models. This means that market mechanisms can be utilized for implementing parallelized model building and inference for probabilistic modelling.
Propositional Independence - Formula-Variable Independence and Forgetting
Lang, J., Liberatore, P., Marquis, P.
Independence -- the study of what is relevant to a given problem of reasoning -- has received increasing attention from the AI community. In this paper, we consider two basic forms of independence, namely, a syntactic one and a semantic one. We show features and drawbacks of them. In particular, while the syntactic form of independence is computationally easy to check, there are cases in which things that intuitively are not relevant are not recognized as such. We also consider the problem of forgetting, i.e., distilling from a knowledge base only the part that is relevant to the set of queries constructed from a subset of the alphabet. While such a process is computationally hard, it allows for a simplification of subsequent reasoning, and can thus be viewed as a form of compilation: once the relevant part of a knowledge base has been extracted, all reasoning tasks to be performed can be simplified.
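The gap between the two forms of independence, and the effect of forgetting, can be shown on small formulas by brute force. This is a minimal sketch, assuming formulas are represented as Python callables over a dict of truth values (a representation chosen here for illustration, not taken from the paper). Forgetting a variable x from a formula is taken in its standard sense, the disjunction of the formula with x set to true and with x set to false. The exhaustive enumeration is exponential in the number of variables, in line with the hardness noted above.

```python
from itertools import product


def is_independent(formula, variables, x):
    """Semantic independence: flipping x never changes the truth value."""
    others = [v for v in variables if v != x]
    for values in product([False, True], repeat=len(others)):
        base = dict(zip(others, values))
        if formula({**base, x: True}) != formula({**base, x: False}):
            return False
    return True


def forget(formula, x):
    """Forget x: the formula with x set to true, OR the formula with x set to false."""
    return lambda a: formula({**a, x: True}) or formula({**a, x: False})


# phi = a OR (b AND NOT b): b occurs in the formula, so a syntactic check
# reports a dependence on b, yet b has no influence on the truth value.
phi = lambda v: v["a"] or (v["b"] and not v["b"])
b_irrelevant = is_independent(phi, ["a", "b"], "b")
a_irrelevant = is_independent(phi, ["a", "b"], "a")

# kb = (rain implies wet) AND rain; forgetting rain leaves only what kb says about wet.
kb = lambda v: ((not v["rain"]) or v["wet"]) and v["rain"]
kb_wet_only = forget(kb, "rain")
```

After forgetting `rain`, the resulting formula is true exactly when `wet` is true, so queries about `wet` alone can be answered without the original knowledge base.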
Interactive Execution Monitoring of Agent Teams
Berry, P., Lee, T. J., Wilkins, D. E.
There is an increasing need for automated support for humans monitoring the activity of distributed teams of cooperating agents, both human and machine. We characterize the domain-independent challenges posed by this problem, and describe how properties of domains influence the challenges and their solutions. We will concentrate on dynamic, data-rich domains where humans are ultimately responsible for team behavior. Thus, the automated aid should interactively support effective and timely decision making by the human. We present a domain-independent categorization of the types of alerts a plan-based monitoring system might issue to a user, where each type generally requires different monitoring techniques. We describe a monitoring framework for integrating many domain-specific and task-specific monitoring techniques and then using the concept of value of an alert to avoid operator overload. We use this framework to describe an execution monitoring approach we have used to implement Execution Assistants (EAs) in two different dynamic, data-rich, real-world domains to assist a human in monitoring team behavior. One domain (Army small unit operations) has hundreds of mobile, geographically distributed agents, a combination of humans, robots, and vehicles. The other domain (teams of unmanned ground and air vehicles) has a handful of cooperating robots. Both domains involve unpredictable adversaries in the vicinity. Our approach customizes monitoring behavior for each specific task, plan, and situation, as well as for user preferences. Our EAs alert the human controller when reported events threaten plan execution or physically threaten team members. Alerts were generated in a timely manner without inundating the user with too many alerts (less than 10 percent of alerts are unwanted, as judged by domain experts).