Learning Lateral Interactions for Feature Binding and Sensory Segmentation
We present a new approach to the supervised learning of lateral interactions for the competitive layer model (CLM), a dynamic feature binding architecture. The method is based on consistency conditions, which were recently shown to characterize the attractor states of this linear threshold recurrent network. For a given set of training examples, the learning problem is formulated as a convex quadratic optimization problem in the lateral interaction weights. An efficient dimension reduction of the learning problem can be achieved by using a linear superposition of basis interactions.
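The CLM's attractor states emerge from linear-threshold recurrent dynamics under lateral excitation and inhibition. A minimal numerical sketch of such dynamics, with a small hand-set interaction matrix (not one learned by the proposed method): two mutually excitatory units bind together while a competing unit is suppressed.

```python
import numpy as np

# Minimal sketch of linear-threshold recurrent dynamics (hand-set
# weights for illustration, not the paper's learned interactions):
# activities are clipped at zero and evolve under lateral interactions J.
J = np.array([[ 0.0,  0.5, -0.5],
              [ 0.5,  0.0, -0.5],
              [-0.5, -0.5,  0.0]])   # units 0,1 excite each other; unit 2 competes

x = np.array([0.2, 0.1, 0.3])        # initial activities
for _ in range(200):
    # Leaky update with a small constant drive, clipped at zero.
    x = np.maximum(0.0, x + 0.1 * (J @ x - x + 0.1))

# At the attractor, units 0 and 1 are co-active ("bound") and unit 2
# has been driven to zero by the lateral inhibition.
```

The attractor here is the kind of grouping state the consistency conditions characterize: active units support each other, inactive units receive net negative input.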
Causal Categorization with Bayes Nets
A theory of categorization is presented in which knowledge of causal relationships between category features is represented as a Bayesian network. Referred to as causal-model theory, this theory predicts that objects are classified as category members to the extent that they are likely to have been produced by a category's causal model. On this view, people have models of the world that lead them to expect a certain distribution of features in category members (e.g., correlations between feature pairs that are directly connected by causal relationships), and consider exemplars good category members when they manifest those expectations. These expectations include sensitivity to higher-order feature interactions that emerge from the asymmetries inherent in causal relationships. Research on the topic of categorization has traditionally focused on the problem of learning new categories given observations of category members.
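A toy illustration of the core idea, with assumed numbers (not taken from the paper): a category's causal model is a two-feature Bayes net X -> Y, and an exemplar is scored by the likelihood that this model generated it.

```python
# Assumed toy parameters: cause X is usually present, effect Y tracks X.
p_x = 0.8                        # P(X=1)
p_y_given_x = {1: 0.9, 0: 0.1}   # P(Y=1 | X=x)

def likelihood(x, y):
    """P(x, y) = P(x) * P(y | x) under the causal model."""
    px = p_x if x == 1 else 1.0 - p_x
    py = p_y_given_x[x] if y == 1 else 1.0 - p_y_given_x[x]
    return px * py

# Exemplars that preserve the expected cause-effect correlation,
# (1, 1) and (0, 0), score higher than those that break it, (1, 0)
# and (0, 1) -- even though (1, 0) has more individually typical
# features than (0, 0). This is the sensitivity to feature
# interactions that the abstract describes.
```

Here `likelihood(0, 0)` = 0.18 exceeds `likelihood(1, 0)` = 0.08: a consistently "broken" exemplar beats one that violates the expected correlation.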
Pranking with Ranking
We discuss the problem of ranking instances. In our framework each instance is associated with a rank or a rating, which is an integer from 1 to k. Our goal is to find a rank-prediction rule that assigns each instance a rank which is as close as possible to the instance's true rank. We describe a simple and efficient online algorithm, analyze its performance in the mistake bound model, and prove its correctness. We describe two sets of experiments, with synthetic data and with the EachMovie dataset for collaborative filtering. In the experiments we performed, our algorithm outperforms online algorithms for regression and classification applied to ranking.
1 Introduction
The ranking problem we discuss in this paper shares common properties with both classification and regression problems. As in classification problems, the goal is to assign one of k possible labels to a new instance. Similar to regression problems, the set of k labels is structured, as there is a total order relation between the labels. We refer to the labels as ranks and, without loss of generality, assume that the ranks constitute the set {1, 2, ..., k}.
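A sketch of a PRank-style online rank learner on synthetic data (the general perceptron-with-thresholds scheme; details and data here are illustrative, not the paper's experiments): a weight vector scores each instance, k-1 ordered thresholds cut the score line into rank intervals, and on a mistake both the weights and the violated thresholds are nudged.

```python
import numpy as np

def prank_fit(X, y, k, epochs=10):
    """Online PRank-style learner: weights w plus thresholds b_1..b_{k-1}."""
    w = np.zeros(X.shape[1])
    b = np.zeros(k - 1)                      # rank r is the interval below b_r
    for _ in range(epochs):
        for x, rank in zip(X, y):
            score = w @ x
            pred = 1 + np.sum(score >= b)    # smallest r with score < b_r
            if pred != rank:
                # y_r = +1 if the true rank lies above threshold r, else -1.
                yr = np.where(np.arange(1, k) < rank, 1.0, -1.0)
                # Update only the thresholds the score violates.
                tau = np.where(yr * (score - b) <= 0, yr, 0.0)
                w += tau.sum() * x
                b -= tau
    return w, b

def prank_predict(w, b, x):
    return 1 + int(np.sum(w @ x >= b))

# Tiny synthetic check: the rank grows with a single feature.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1, 2, 3, 4])
w, b = prank_fit(X, y, k=4, epochs=200)
```

The mistake-driven threshold update keeps the rank intervals ordered while pulling the score of each instance into its correct interval; this is the sense in which the problem sits between classification (discrete labels) and regression (ordered labels).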
Global Coordination of Local Linear Models
Roweis, Sam T., Saul, Lawrence K., Hinton, Geoffrey E.
High dimensional data that lies on or near a low dimensional manifold can be described by a collection of local linear models. Such a description, however, does not provide a global parameterization of the manifold--arguably an important goal of unsupervised learning. In this paper, we show how to learn a collection of local linear models that solves this more difficult problem. Our local linear models are represented by a mixture of factor analyzers, and the "global coordination" of these models is achieved by adding a regularizing term to the standard maximum likelihood objective function. The regularizer breaks a degeneracy in the mixture model's parameter space, favoring models whose internal coordinate systems are aligned in a consistent way. As a result, the internal coordinates change smoothly and continuously as one traverses a connected path on the manifold--even when the path crosses the domains of many different local models. The regularizer takes the form of a Kullback-Leibler divergence and illustrates an unexpected application of variational methods: not to perform approximate inference in intractable probabilistic models, but to learn more useful internal representations in tractable ones.
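A sketch of the premise only, on assumed toy data (this covers a manifold with local linear fits; it does not implement the paper's mixture-of-factor-analyzers coordination): points near a half-circle in 2-D are split into regions, and each region gets its own centroid-plus-direction linear model. Each model's internal coordinate is position along its own direction, and nothing aligns the coordinates of neighboring models -- the degeneracy the paper's regularizer is designed to break.

```python
import numpy as np

# Assumed toy data: noisy samples from a 1-D manifold (half-circle) in 2-D.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, np.pi, 400)
data = np.column_stack([np.cos(t), np.sin(t)]) + 0.02 * rng.normal(size=(400, 2))

edges = np.linspace(-1.0, 1.0, 5)      # 4 regions along the x-axis
models, resid = [], []
for lo, hi in zip(edges[:-1], edges[1:]):
    pts = data[(data[:, 0] >= lo) & (data[:, 0] < hi)]
    mean = pts.mean(axis=0)
    # First principal direction = this region's local linear model.
    _, _, vt = np.linalg.svd(pts - mean, full_matrices=False)
    d = vt[0]
    centered = pts - mean
    off_line = centered - np.outer(centered @ d, d)
    resid.append(np.linalg.norm(off_line, axis=1).max())   # fit quality
    models.append((mean, d))

# Each region is well described by its line (small residuals), but the
# per-region coordinates (mean, d) carry no shared global parameterization.
```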
Information-Geometrical Significance of Sparsity in Gallager Codes
Tanaka, Toshiyuki, Ikeda, Shiro, Amari, Shun-ichi
We report a result of perturbation analysis on the decoding error of the belief propagation decoder for Gallager codes. The analysis is based on information geometry, and it shows that the principal term of the decoding error at equilibrium comes from the m-embedding curvature of the log-linear submanifold spanned by the estimated pseudoposteriors: one for the full marginal, and K for the partial posteriors, each of which takes a single check into account, where K is the number of checks in the Gallager code. It is then shown that the principal error term vanishes when the parity-check matrix of the code is so sparse that no two columns have overlap greater than 1.
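The sparsity condition is a direct combinatorial property of the parity-check matrix and can be checked mechanically. A small helper (illustrative, not from the paper) that computes the largest pairwise column overlap:

```python
import numpy as np

def max_column_overlap(H):
    """Largest number of rows in which two distinct columns of H are both 1."""
    G = H.T @ H                  # G[i, j] = overlap of columns i and j
    np.fill_diagonal(G, 0)       # ignore each column's overlap with itself
    return int(G.max())

# A sparse parity-check matrix satisfying the condition (overlap <= 1),
# so the principal error term of the analysis vanishes ...
H_sparse = np.array([[1, 1, 0, 0],
                     [0, 1, 1, 0],
                     [1, 0, 1, 1]])

# ... and one violating it: columns 0 and 1 share two checks (rows 0, 1).
H_dense = np.array([[1, 1, 0],
                    [1, 1, 1],
                    [0, 0, 1]])
```

In graph terms, overlap at most 1 means no two variable nodes share more than one check, i.e., the code's factor graph contains no length-4 cycles.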