Neural Information Processing Systems
Structure Driven Image Database Retrieval
Bonet, Jeremy S. De, Viola, Paul A.
A new algorithm is presented which approximates the perceived visual similarity between images. The images are initially transformed intoa feature space which captures visual structure, texture and color using a tree of filters. Similarity is the inverse of the distance in this perceptual feature space. Using this algorithm we have constructed an image database system which can perform example based retrieval on large image databases. Using carefully constructed target sets, which limit variation to only a single visual characteristic, retrieval rates are quantitatively compared to those of standard methods. 1 Introduction Without supplementary information, there exists no way to directly measure the similarity between the content of images.
Asymptotic Theory for Regularization: One-Dimensional Linear Case
The generalization ability of a neural network can sometimes be improved dramatically by regularization. To analyze the improvement oneneeds more refined results than the asymptotic distribution ofthe weight vector. Here we study the simple case of one-dimensional linear regression under quadratic regularization, i.e., ridge regression. We study the random design, misspecified case, where we derive expansions for the optimal regularization parameter andthe ensuing improvement. It is possible to construct examples where it is best to use no regularization.
Experiences with Bayesian Learning in a Real World Application
Sykacek, Peter, Dorffner, Georg, Rappelsberger, Peter, Zeitlhofer, Josef
Sleep staging is usually based on rules defined by Rechtschaffen and Kales (see [8]). Rechtschaffen and Kales rules define 4 sleep stages, stage one to four, as well as rapid eye movement (REM) and wakefulness. In [1] J. Bentrup and S. Ray report that every year nearly one million US citizens consulted their physicians concerning their sleep. Since sleep staging is a tedious task (one all night recording on average takes abou t 3 hours to score manually), much effort was spent in designing automatic sleep stagers. Sleep staging is a classification problem which was solved using classical statistical t.echniques or techniques emerged from the field of artificial intelligence (AI) .
Unsupervised On-line Learning of Decision Trees for Hierarchical Data Analysis
Held, Marcus, Buhmann, Joachim M.
An adaptive online algorithm is proposed to estimate hierarchical data structures for non-stationary data sources. The approach is based on the principle of minimum cross entropy to derive a decision tree for data clustering and it employs a metalearning idea (learning to learn) to adapt to changes in data characteristics. Its efficiency is demonstrated by grouping non-stationary artifical data and by hierarchical segmentation of LANDSAT images. 1 Introduction Unsupervised learning addresses the problem to detect structure inherent in unlabeled andunclassified data. N. The encoding usually is represented by an assignment matrix M (Mia), where Mia 1 if and only if Xi belongs to cluster L: 1 MiaV (Xi, Ya) measures the quality of a data partition, Le., optimal assignments and prototypes (M,y)OPt argminM,y1i (M,Y) minimize the inhomogeneity of clusters w.r.t. a given distance measure V. For reasons of simplicity we restrict the presentation to the ' sum-of-squared-error criterion V(x, y) To facilitate this minimization a deterministic annealing approach was proposed in [5] which maps the discrete optimization problem, i.e. how to determine the data assignments, viathe Maximum Entropy Principle [2] to a continuous parameter es- Unsupervised Online Learning ofDecision Trees for Data Analysis 515 timation problem.
Learning to Order Things
Cohen, William W., Schapire, Robert E., Singer, Yoram
Most previous work in inductive learning has concentrated on learning to classify. However, there are many applications in which it is desirable to order rather than classify instances. An example might be a personalized email filter that gives a priority ordering to unread mail. Here we will consider the problem of learning how to construct such orderings, given feedback in the form of preference judgments, i.e., statements that one instance should be ranked ahead of another. Such orderings could be constructed based on a learned classifier or regression model, and in fact often are.
Detection of First and Second Order Motion
Grunewald, Alexander, Neumann, Heiko
A model of motion detection is presented. The model contains three stages. The first stage is unoriented and is selective for contrast polarities.The next two stages work in parallel. A phase insensitive stage pools across different contrast polarities through a spatiotemporal filter and thus can detect first and second order motion. A phase sensitive stage keeps contrast polarities separate, each of which is filtered through a spatiotemporal filter, and thus only first order motion can be detected.
Nonparametric Model-Based Reinforcement Learning
This paper describes some of the interactions of model learning algorithms and planning algorithms we have found in exploring model-based reinforcement learning. The paper focuses on how local trajectoryoptimizers can be used effectively with learned nonparametric models.We find that trajectory planners that are fully consistent with the learned model often have difficulty finding reasonable plansin the early stages of learning. Trajectory planners that balance obeying the learned model with minimizing cost (or maximizing reward) often do better, even if the plan is not fully consistent with the learned model. 1 INTRODUCTION We are exploring the use of nonparametric models in robot learning (Atkeson et al., 1997b; Atkeson and Schaal, 1997). This paper describes the interaction of model learning algorithms and planning algorithms, focusing on how local trajectory optimization canbe used effectively with nonparametric models in reinforcement learning. We find that trajectory optimizers that are fully consistent with the learned model often have difficulty finding reasonable plans in the early stages of learning. The message of this paper is that a planner should not be entirely consistent with the learned model during model-based reinforcement learning.
Learning Continuous Attractors in Recurrent Networks
One approach to invariant object recognition employs a recurrent neural networkas an associative memory. In the standard depiction of the network's state space, memories of objects are stored as attractive fixed points of the dynamics. I argue for a modification of this picture: if an object has a continuous family of instantiations, it should be represented by a continuous attractor. This idea is illustrated with a network that learns to complete patterns. To perform the task of filling in missing information, thenetwork develops a continuous attractor that models the manifold from which the patterns are drawn.
Agnostic Classification of Markovian Sequences
El-Yaniv, Ran, Fine, Shai, Tishby, Naftali
Classification of finite sequences without explicit knowledge of their statistical nature is a fundamental problem with many important applications. We propose a new information theoretic approach to this problem which is based on the following ingredients: (i) sequences aresimilar when they are likely to be generated by the same source; (ii) cross entropies can be estimated via "universal compression"; (iii)Markovian sequences can be asymptotically-optimally merged. With these ingredients we design a method for the classification of discrete sequences whenever they can be compressed. We introduce the method and illustrate its application for hierarchical clustering of languages and for estimating similarities of protein sequences.
Function Approximation with the Sweeping Hinge Algorithm
Hush, Don R., Lozano, Fernando, Horne, Bill G.
We present a computationally efficient algorithm for function approximation withpiecewise linear sigmoidal nodes. A one hidden layer network is constructed one node at a time using the method of fitting the residual. The task of fitting individual nodes is accomplished usinga new algorithm that searchs for the best fit by solving a sequence of Quadratic Programming problems. This approach offers significantadvantages over derivative-based search algorithms (e.g.