Separating Style and Content
Tenenbaum, Joshua B., Freeman, William T.
We seek to analyze and manipulate two factors, which we call style and content, underlying a set of observations. We fit training data with bilinear models which explicitly represent the two-factor structure. These models can adapt easily during testing to new styles or content, allowing us to solve three general tasks: extrapolation of a new style to unobserved content; classification of content observed in a new style; and translation of new content observed in a new style.
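To make the two-factor structure concrete, here is a minimal sketch (synthetic data, illustrative names) of fitting an asymmetric bilinear model y[s, c] ≈ A[s] b[c] by SVD, one standard way to fit such models; it is not necessarily the authors' exact procedure.

```python
import numpy as np

# Illustrative fit of an asymmetric bilinear model y[s, c] ≈ A[s] @ b[c]
# via SVD; S styles, C contents, d-dimensional observations, rank J.
# The data here are synthetic stand-ins for real training observations.
S, C, d, J = 4, 5, 10, 2
rng = np.random.default_rng(0)
Y = rng.normal(size=(S, C, d))

# Stack the style-by-content observations into an (S*d) x C matrix;
# the top-J SVD factors give style maps A[s] and content vectors b[c].
Y_stacked = Y.transpose(0, 2, 1).reshape(S * d, C)
U, sv, Vt = np.linalg.svd(Y_stacked, full_matrices=False)
A = (U[:, :J] * sv[:J]).reshape(S, d, J)   # style-specific linear maps
B = Vt[:J, :].T                            # one content vector per content class

# Reconstruct the observation for (style 0, content 1) from the factors.
recon = A[0] @ B[1]
print(np.linalg.norm(recon - Y[0, 1]))     # residual of the rank-J fit
```

Extrapolation to a new style then amounts to estimating a new style map A_new by least squares from a few observations of that style and applying it to the known content vectors.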
Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning
Schneider, Jeff G.
[Figure 2: The task is to move the cart to the origin as quickly as possible without dropping the pole; the bottom three pictures show a trace of the policy execution obtained after one, two, and three trials, in increments of 0.5 seconds. Accompanying table: controller (e.g., LQR), number of data points used to build the controller, and cost from the initial state.]
Competition Among Networks Improves Committee Performance
Munro, Paul W., Parmanto, Bambang
Since a neural network predictor inherently has an excessive number of parameters, reducing the prediction error is usually done by reducing variance. Methods for reducing neural network complexity can be viewed as regularization techniques to reduce this variance. Examples of such methods are Optimal Brain Damage (Le Cun et al.).
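To see why combining predictors reduces variance, consider a toy committee of independent noisy regressors; the sketch below (synthetic numbers, not the authors' competition mechanism) shows the committee's squared error shrinking roughly as 1/n.

```python
import numpy as np

# Toy illustration of variance reduction by a committee: average the
# predictions of several noisy regressors and compare squared error.
rng = np.random.default_rng(1)
true_value = 1.0
n_members, n_trials = 10, 10_000

# Each member's prediction = truth + independent noise (a stand-in for
# the variance of individually trained networks).
preds = true_value + rng.normal(scale=0.5, size=(n_trials, n_members))

single_mse = np.mean((preds[:, 0] - true_value) ** 2)
committee_mse = np.mean((preds.mean(axis=1) - true_value) ** 2)
print(single_mse, committee_mse)   # committee MSE ~ single MSE / n_members
```

The 1/n reduction holds only to the extent that member errors are uncorrelated, which is what motivates making the member networks differ from one another.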
Local Bandit Approximation for Optimal Learning Problems
Duff, Michael O., Barto, Andrew G.
A Bayesian formulation of the problem leads to a clear concept of a solution whose computation, however, appears to entail an examination of an intractably large number of hyperstates. This paper has suggested extending the Gittins index approach (which applies with great power and elegance to the special class of multi-armed bandit processes) to general adaptive MDPs. The hope has been that if certain salient features of the value of information could be captured, even approximately, then one could be led to a reasonable method for avoiding certain defects of certainty-equivalence approaches (problems with identifiability, "metastability"). Obviously, positive evidence, in the form of empirical results from simulation experiments, would lend support to these ideas; work along these lines is underway. Local bandit approximation is but one approximate computational approach for problems of optimal learning and dual control. Most prominent in the literature of control theory is the "wide-sense" approach of [Bar-Shalom & Tse, 1976], which utilizes local quadratic approximations about nominal state/control trajectories. For certain problems, this method has demonstrated superior performance compared to a certainty-equivalence approach, but it is computationally very intensive and unwieldy, particularly for problems with controller dimension greater than one. One could revert to the view of the bandit problem, or general adaptive MDP, as simply a very large MDP defined over hyperstates, and then consider a somewhat direct approach in which one performs approximate dynamic programming with function approximation over this domain; details of function approximation, feature selection, and "training" all become important design issues.
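For readers unfamiliar with the hyperstate formulation, the toy sketch below (not the paper's local bandit approximation) sets up a Bayesian two-armed Bernoulli bandit whose hyperstate is the pair of Beta posterior parameters, and contrasts certainty-equivalent greedy selection with a crude value-of-information bonus; all numbers are illustrative.

```python
import numpy as np

# Minimal Bayesian two-armed Bernoulli bandit, to give the flavor of the
# hyperstate formulation. The hyperstate is the vector of Beta posterior
# parameters (alpha, beta) for each arm.
rng = np.random.default_rng(2)
true_p = [0.4, 0.6]
alpha = np.ones(2)   # Beta prior: successes + 1
beta = np.ones(2)    # Beta prior: failures + 1

for t in range(1000):
    mean = alpha / (alpha + beta)
    # Certainty equivalence would pick argmax(mean); adding a crude
    # value-of-information bonus (the posterior standard deviation)
    # counteracts "metastability", i.e. locking onto a poorly
    # identified arm forever.
    std = np.sqrt(alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1)))
    arm = int(np.argmax(mean + std))
    reward = rng.random() < true_p[arm]
    alpha[arm] += reward
    beta[arm] += 1 - reward

print(alpha / (alpha + beta))    # posterior means after learning
```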
Early Brain Damage
Tresp, Volker, Neuneier, Ralph, Zimmermann, Hans-Georg
Optimal Brain Damage (OBD) is a method for reducing the number of weights in a neural network. OBD estimates the increase in the cost function when weights are pruned; the estimate is a valid approximation only if the learning algorithm has converged to a local minimum. On the other hand, it is often desirable to terminate the learning process before a local minimum is reached (early stopping). In this paper we show that OBD estimates the increase in the cost function incorrectly if the network is not in a local minimum. We also show how OBD can be extended so that it can be used in connection with early stopping. We call this new approach Early Brain Damage, EBD. EBD also makes it possible to revive weights that have already been pruned. We demonstrate the improvements achieved by EBD using three publicly available data sets.
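The correction can be seen in the second-order Taylor expansion of the cost: setting weight w_i to zero changes the cost by approximately -g_i w_i + (1/2) h_i w_i^2. The sketch below uses made-up gradient and diagonal-Hessian values to contrast the two saliency estimates.

```python
import numpy as np

# Sketch of the two saliency estimates (made-up values, not a real network).
# Setting weight w_i to zero changes the cost by approximately
#   dE_i ≈ -g_i * w_i + 0.5 * h_i * w_i**2   (2nd-order Taylor expansion).
# OBD drops the gradient term, which is only valid at a local minimum;
# an EBD-style estimate keeps it, so it remains usable with early stopping.
w = np.array([0.8, -0.3, 0.05])   # weights
g = np.array([0.2, 0.0, -0.1])    # dE/dw, nonzero before convergence
h = np.array([1.0, 4.0, 2.0])     # diagonal Hessian estimates

obd_saliency = 0.5 * h * w**2
ebd_saliency = -g * w + 0.5 * h * w**2
print(obd_saliency)   # [0.32   0.18   0.0025]
print(ebd_saliency)   # [0.16   0.18   0.0075]
```

Note how the two estimates disagree exactly where the gradient is nonzero: the first weight looks half as costly to prune once the first-order term is included, and the third looks more costly.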
Why did TD-Gammon Work?
Pollack, Jordan B., Blair, Alan D.
Although TD-Gammon is one of the major successes in machine learning, it has not led to similar impressive breakthroughs in temporal difference learning for other applications or even other games. We were able to replicate some of the success of TD-Gammon without temporal difference learning. Instead we apply simple hill-climbing in a relative fitness environment. These results and further analysis suggest that the surprising success of Tesauro's program had more to do with the co-evolutionary structure of the learning task and the dynamics of the backgammon game itself. 1 INTRODUCTION It took great chutzpah for Gerald Tesauro to start wasting computer cycles on temporal difference learning, letting a program play itself in the hopes of mastering Backgammon (Tesauro, 1992). After all, the dream of computers mastering a domain by self-play or "introspection" had been around since the early days of AI, forming part of Samuel's checker player (Samuel, 1959) and used in Donald Michie's MENACE tic-tac-toe learner (Michie, 1961). However, such self-conditioning systems, with weak or nonexistent internal representations, had generally been limited by problems of scale and abandoned by the field of AI.
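A skeletal version of hill-climbing in a relative fitness environment, assuming only a two-player game whose policies are parameterized by weight vectors; the play function below is a toy stand-in, not backgammon, and the mutation scale is arbitrary.

```python
import numpy as np

# Skeleton of hill-climbing in a relative fitness environment: mutate the
# champion's weights, let mutant and champion play, and keep the winner.
# play(w_a, w_b) is a placeholder for any two-player game in which each
# policy is parameterized by a weight vector; here it is a toy stand-in.
rng = np.random.default_rng(3)

def play(w_a, w_b):
    """Toy game: the policy whose weights better match a hidden target wins."""
    target = np.linspace(-1, 1, w_a.size)
    return np.linalg.norm(w_a - target) < np.linalg.norm(w_b - target)

champion = rng.normal(size=20)
for generation in range(2000):
    mutant = champion + rng.normal(scale=0.05, size=champion.size)
    if play(mutant, champion):      # relative fitness: beat the current champ
        champion = mutant
print(champion[:5])
```

The point of the relative fitness setup is that the opponent improves along with the learner, so the evaluation criterion moves as the population does, which is the co-evolutionary structure the abstract refers to.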
Promoting Poor Features to Supervisors: Some Inputs Work Better as Outputs
Caruana, Rich, Sa, Virginia R. de
In supervised learning there is usually a clear distinction between inputs and outputs - inputs are what you will measure, outputs are what you will predict from those measurements. This paper shows that the distinction between inputs and outputs is not this simple. Some features are more useful as extra outputs than as inputs. By using a feature as an output we get more than just the case values: we can learn a mapping from the other inputs to that feature. For many features this mapping may be more useful than the feature value itself. We present two regression problems and one classification problem where performance improves if features that could have been used as inputs are used as extra outputs instead.
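As an illustration of the input-versus-output choice (synthetic data, not the paper's benchmarks), the sketch below trains one network with a noisy feature z as an extra input and another with z as an extra output, then scores both only on the main target; scikit-learn's MLPRegressor is used purely for brevity.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic setup: a feature z that is a function of the inputs but is
# measured very noisily, so it may help more as an extra output.
rng = np.random.default_rng(4)
X = rng.normal(size=(500, 4))
z_clean = X[:, :2].sum(axis=1)                    # z depends on the inputs
z = z_clean + rng.normal(scale=2.0, size=500)     # ...but is noisy to measure
y = z_clean + 0.5 * X[:, 2]                       # main regression target

# Variant A: use z as a fifth input.
net_in = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net_in.fit(np.column_stack([X, z])[:400], y[:400])

# Variant B: use z as an extra output (multi-task target); score only on y.
net_out = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net_out.fit(X[:400], np.column_stack([y, z])[:400])

test = slice(400, 500)
mse_in = np.mean((net_in.predict(np.column_stack([X, z])[test]) - y[test]) ** 2)
mse_out = np.mean((net_out.predict(X[test])[:, 0] - y[test]) ** 2)
print(mse_in, mse_out)
```

In variant B the network never sees the noisy measurement at test time; it only uses z during training, as a hint that shapes the hidden representation.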
Analog VLSI Circuits for Attention-Based, Visual Tracking
Horiuchi, Timothy K., Morris, Tonia G., Koch, Christof, DeWeerth, Stephen P.
A one-dimensional visual tracking chip has been implemented using neuromorphic, analog VLSI techniques to model selective visual attention in the control of saccadic and smooth pursuit eye movements. The chip incorporates focal-plane processing to compute image saliency and a winner-take-all circuit to select a feature for tracking. The target position and direction of motion are reported as the target moves across the array. We demonstrate its functionality in a closed-loop system which performs saccadic and smooth pursuit tracking movements using a one-dimensional mechanical eye. 1 Introduction Tracking a moving object on a cluttered background is a difficult task. When more than one target is in the field of view, a decision must be made to determine which target to track and what its movement characteristics are.
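A software caricature of the selection stage, assuming a simulated one-dimensional intensity array; the real system is an analog circuit, so the track function below only mimics the saliency-plus-winner-take-all behavior described above, with an arbitrary edge-based saliency measure.

```python
import numpy as np

# Software sketch of the chip's selection stage: compute a saliency profile
# over a 1-D photoreceptor array, let a winner-take-all pick the target,
# and report its position plus direction of motion.
def track(frames):
    """frames: iterable of 1-D intensity arrays from the (simulated) array."""
    prev_winner = None
    for frame in frames:
        saliency = np.abs(np.gradient(frame))    # crude edge-based saliency
        winner = int(np.argmax(saliency))        # winner-take-all selection
        if prev_winner is not None:
            direction = np.sign(winner - prev_winner)  # -1, 0, or +1
            yield winner, direction
        prev_winner = winner

# Toy stimulus: a bright target drifting across a 64-pixel array.
frames = [np.exp(-0.5 * ((np.arange(64) - (10 + t)) / 2.0) ** 2)
          for t in range(20)]
for pos, direction in track(frames):
    pass
print(pos, direction)   # final reported position and direction of motion
```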
Text-Based Information Retrieval Using Exponentiated Gradient Descent
Papka, Ron, Callan, James P., Barto, Andrew G.
The following investigates the use of single-neuron learning algorithms to improve the performance of text-retrieval systems that accept natural-language queries. A retrieval process is explained that transforms the natural-language query into the query syntax of a real retrieval system: the initial query is expanded using statistical and learning techniques and is then used for document ranking and binary classification. The results of experiments suggest that Kivinen and Warmuth's Exponentiated Gradient Descent learning algorithm works significantly better than previous approaches. 1 Introduction The following work explores two learning algorithms - Least Mean Squared (LMS) [1] and Exponentiated Gradient Descent (EG) [2] - in the context of text-based Information Retrieval (IR) systems. The experiments presented in [3] use connectionist learning models to improve the retrieval of relevant documents from a large collection of text. Previous work in the area employs various techniques for improving retrieval [6, 7, 14].
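The two update rules being compared can be stated compactly; the sketch below implements the standard LMS and EG updates for a linear relevance score on a toy task, with all data and learning rates chosen for illustration rather than taken from the paper's retrieval experiments.

```python
import numpy as np

# Minimal versions of the two update rules for a linear relevance score
# y_hat = w . x (x: document/query term weights, y: relevance label).
def lms_update(w, x, y, lr=0.1):
    """Least Mean Squares (plain gradient descent) update."""
    return w - lr * (w @ x - y) * x

def eg_update(w, x, y, lr=0.1):
    """Exponentiated Gradient update; w stays a probability vector."""
    w = w * np.exp(-2.0 * lr * (w @ x - y) * x)
    return w / w.sum()

rng = np.random.default_rng(5)
w_lms = np.full(5, 0.2)
w_eg = np.full(5, 0.2)
for _ in range(100):
    x = rng.random(5)               # stand-in for query-term weights
    y = x[0]                        # toy relevance: only term 0 matters
    w_lms = lms_update(w_lms, x, y)
    w_eg = eg_update(w_eg, x, y)
print(w_lms.round(2), w_eg.round(2))
```

The multiplicative EG update keeps the weights positive and normalized, which is why it behaves so differently from LMS when only a few of many terms are actually relevant.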