Technology
Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
There are several reinforcement learning algorithms that yield approximate solutionsfor the problem of policy evaluation when the value function is represented with a linear function approximator. In this paper we show that each of the solutions is optimal with respect to a specific objective function.
Bias-Optimal Incremental Problem Solving
Given is a problem sequence and a probability distribution (the bias) on programs computing solution candidates. We present an optimally fast way of incrementally solving each task in the sequence. Bias shifts are computed by program prefixes that modify the distribution on their suffixes byreusing successful code for previous tasks (stored in non-modifiable memory). No tested program gets more runtime than its probability times the total search time.
Minimax Differential Dynamic Programming: An Application to Robust Biped Walking
Morimoto, Jun, Atkeson, Christopher G.
We developed a robust control policy design method in high-dimensional state space by using differential dynamic programming with a minimax criterion. As an example, we applied our method to a simulated five link biped robot. The results show lower joint torques from the optimal control policycompared to a hand-tuned PD servo controller. Results also show that the simulated biped robot can successfully walk with unknown disturbances that cause controllers generated by standard differential dynamic programmingand the hand-tuned PD servo to fail. Learning to compensate for modeling error and previously unknown disturbances in conjunction with robust control design is also demonstrated.
Learning a Forward Model of a Reflex
Porr, Bernd, Wörgötter, Florentin
We develop a systems theoretical treatment of a behavioural system that interacts with its environment in a closed loop situation such that its motor actionsinfluence its sensor inputs. The simplest form of a feedback is a reflex. Reflexes occur always "too late"; i.e., only after a (unpleasant, painful,dangerous) reflex-eliciting sensor event has occurred. This defines an objective problem which can be solved if another sensor input exists which can predict the primary reflex and can generate an earlier reaction. In contrast to previous approaches, our linear learning algorithm allowsfor an analytical proof that this system learns to apply feedforward controlwith the result that slow feedback loops are replaced by their equivalent feed-forward controller creating a forward model. In other words, learning turns the reactive system into a proactive system. By means of a robot implementation we demonstrate the applicability of the theoretical results which can be used in a variety of different areas in physics and engineering.
A Probabilistic Model for Learning Concatenative Morphology
Snover, Matthew G., Brent, Michael R.
This paper describes a system for the unsupervised learning of morphological suffixesand stems from word lists. The system is composed of a generative probability model and hill-climbing and directed search algorithms. Byextracting and examining morphologically rich subsets of an input lexicon, the directed search identifies highly productive paradigms.
"Name That Song!" A Probabilistic Approach to Querying on Music and Text
Eric, Brochu, Freitas, Nando de
We present a novel, flexible statistical approach for modelling music and text jointly. The approach is based on multi-modal mixture models and maximum a posteriori estimation using EM. The learned models can be used to browse databases with documents containing music and text, to search for music using queries consisting of music and text (lyrics and other contextual information), to annotate text documents with music, and to automatically recommend or identify similar songs.
Learning to Classify Galaxy Shapes Using the EM Algorithm
Kirshner, Sergey, Cadez, Igor V., Smyth, Padhraic, Kamath, Chandrika
We describe the application of probabilistic model-based learning to the problem of automatically identifying classes of galaxies, based on both morphological and pixel intensity characteristics. The EM algorithm can be used to learn how to spatially orient a set of galaxies so that they are geometrically aligned. We augment this "ordering-model" with a mixture model on objects, and demonstrate how classes of galaxies can be learned in an unsupervised manner using a two-level EM algorithm. The resulting models provide highly accurate classi£cation of galaxies in cross-validation experiments.
Improving a Page Classifier with Anchor Extraction and Link Analysis
Most text categorization systems use simple models of documents and document collections. In this paper we describe a technique that improves asimple web page classifier's performance on pages from a new, unseen web site, by exploiting link structure within a site as well as page structure within hub pages. On real-world test cases, this technique significantly and substantially improves the accuracy of a bag-of-words classifier, reducing error rate by about half, on average. The system uses a variant of co-training to exploit unlabeled data from a new site. Pages are labeled using the base classifier; the results are used by a restricted wrapper-learner to propose potential "main-category anchor wrappers"; and finally, these wrappers are used as features by a third learner to find a categorization of the site that implies a simple hub structure, but which also largely agrees with the original bag-of-words classifier.
Inferring a Semantic Representation of Text via Cross-Language Correlation Analysis
Vinokourov, Alexei, Cristianini, Nello, Shawe-Taylor, John
The problem of learning a semantic representation of a text document from data is addressed, in the situation where a corpus of unlabeled paired documents is available, each pair being formed by a short English documentand its French translation. This representation can then be used for any retrieval, categorization or clustering task, both in a standard andin a cross-lingual setting. By using kernel functions, in this case simple bag-of-words inner products, each part of the corpus is mapped to a high-dimensional space. The correlations between the two spaces are then learnt by using kernel Canonical Correlation Analysis. A set of directions is found in the first and in the second space that are maximally correlated.Since we assume the two representations are completely independentapart from the semantic content, any correlation between them should reflect some semantic similarity. Certain patterns of English words that relate to a specific meaning should correlate with certain patternsof French words corresponding to the same meaning, across the corpus. Using the semantic representation obtained in this way we first demonstrate that the correlations detected between the two versions of the corpus are significantly higher than random, and hence that a representation basedon such features does capture statistical patterns that should reflect semantic information. Then we use such representation both in cross-language and in single-language retrieval tasks, observing performance that is consistently and significantly superior to LSI on the same data.