Goto

Collaborating Authors

 Clustering


Multidimensional Scaling and Data Clustering

Neural Information Processing Systems

Visualizing and structuring pairwise dissimilarity data are difficult combinatorial optimization problems known as multidimensional scaling or pairwise data clustering. Algorithms for embedding dissimilarity data set in a Euclidian space, for clustering these data and for actively selecting data to support the clustering process are discussed in the maximum entropy framework. Active data selection provides a strategy to discover structure in a data set efficiently with partially unknown data. 1 Introduction Grouping experimental data into compact clusters arises as a data analysis problem in psychology, linguistics, genetics and other experimental sciences. The data which are supposed to be clustered are either given by an explicit coordinate representation (central clustering) or, in the non-metric case, they are characterized by dissimilarity values for pairs of data points (pairwise clustering). In this paper we study algorithms (i) for embedding non-metric data in a D-dimensional Euclidian space, (ii) for simultaneous clustering and embedding of non-metric data, and (iii) for active data selection to determine a particular cluster structure with minimal number of data queries. All algorithms are derived from the maximum entropy principle (Hertz et al., 1991) which guarantees robust statistics (Tikochinsky et al., 1984).


Factorial Learning by Clustering Features

Neural Information Processing Systems

We introduce a novel algorithm for factorial learning, motivated by segmentation problems in computational vision, in which the underlying factors correspond to clusters of highly correlated input features. The algorithm derives from a new kind of competitive clustering model, in which the cluster generators compete to explain eachfeature of the data set and cooperate to explain each input example, rather than competing for examples and cooperating onfeatures, as in traditional clustering algorithms. A natural extension of the algorithm recovers hierarchical models of data generated from multiple unknown categories, each with a different, multiplecausal structure. Several simulations demonstrate the power of this approach.


Multidimensional Scaling and Data Clustering

Neural Information Processing Systems

Visualizing and structuring pairwise dissimilarity data are difficult combinatorial optimization problemsknown as multidimensional scaling or pairwise data clustering. Algorithms for embedding dissimilarity data set in a Euclidian space, for clustering these data and for actively selecting data to support the clustering process are discussed in the maximum entropy framework. Active data selection provides a strategy to discover structure in a data set efficiently with partially unknown data. 1 Introduction Grouping experimental data into compact clusters arises as a data analysis problem in psychology, linguistics,genetics and other experimental sciences. The data which are supposed to be clustered are either given by an explicit coordinate representation (central clustering) or, in the non-metric case, they are characterized by dissimilarity values for pairs of data points (pairwise clustering). In this paper we study algorithms (i) for embedding non-metric data in a D-dimensional Euclidian space, (ii) for simultaneous clustering and embedding of non-metric data, and (iii) for active data selection to determine a particular cluster structure with minimal number of data queries. All algorithms are derived from the maximum entropy principle (Hertz et al., 1991) which guarantees robust statistics (Tikochinsky et al., 1984).


Clustering with a Domain-Specific Distance Measure

Neural Information Processing Systems

The distance measure and learning problem are formally described as nested objective functions. We derive an efficient algorithm by using optimization techniques that allow us to divide up the objective function into parts which may be minimized in distinct phases. The algorithm has accurately recreated 10 prototypes from a randomly generated sample database of 100 images consisting of 20 points each in 120 experiments. Finally, by incorporating permutation invariance in our distance measure, we have a technique that we may be able to apply to the clustering of graphs. Our goal is to develop measures which will enable the learning of objects with shape or structure. Acknowledgements This work has been supported by AFOSR grant F49620-92-J-0465 and ONR/DARPA grant N00014-92-J-4048.


Classifying Hand Gestures with a View-Based Distributed Representation

Neural Information Processing Systems

We present a method for learning, tracking, and recognizing human hand gestures recorded by a conventional CCD camera without any special gloves or other sensors. A view-based representation is used to model aspects of the hand relevant to the trained gestures, and is found using an unsupervised clustering technique. We use normalized correlation networks, with dynamic time warping in the temporal domain, as a distance function for unsupervised clustering. Views are computed separably for space and time dimensions; the distributed response of the combination of these units characterizes the input data with a low dimensional representation. A supervised classification stage uses labeled outputs of the spatiotemporal units as training data. Our system can correctly classify gestures in real time with a low-cost image processing accelerator.


Central and Pairwise Data Clustering by Competitive Neural Networks

Neural Information Processing Systems

Data clustering amounts to a combinatorial optimization problem to reduce the complexity of a data representation and to increase its precision. Central and pairwise data clustering are studied in the maximum entropy framework. For central clustering we derive a set of reestimation equations and a minimization procedure which yields an optimal number of clusters, their centers and their cluster probabilities. A meanfield approximation for pairwise clustering is used to estimate assignment probabilities. A se1fconsistent solution to multidimensional scaling and pairwise clustering is derived which yields an optimal embedding and clustering of data points in a d-dimensional Euclidian space. 1 Introduction A central problem in information processing is the reduction of the data complexity with minimal loss in precision to discard noise and to reveal basic structure of data sets. Data clustering addresses this tradeoff by optimizing a cost function which preserves the original data as complete as possible and which simultaneously favors prototypes with minimal complexity (Linde et aI., 1980; Gray, 1984; Chou et aI., 1989; Rose et ai., 1990). We discuss an objective function for the joint optimization of distortion errors and the complexity of a reduced data representation. A maximum entropy estimation of the cluster assignments yields a unifying framework for clustering algorithms with a number of different distortion and complexity measures. The close analogy of complexity optimized clustering with winner-take-all neural networks suggests a neural-like implementation resembling topological feature maps (see Figure 1).


Classifying Hand Gestures with a View-Based Distributed Representation

Neural Information Processing Systems

We present a method for learning, tracking, and recognizing human hand gestures recorded by a conventional CCD camera without any special gloves or other sensors. A view-based representation is used to model aspects of the hand relevant to the trained gestures, and is found using an unsupervised clustering technique. We use normalized correlation networks, with dynamic time warping in the temporal domain, as a distance function for unsupervised clustering. Views are computed separably for space and time dimensions; the distributed response of the combination of these units characterizes the input data with a low dimensional representation. A supervised classification stage uses labeled outputs of the spatiotemporal units as training data. Our system can correctly classify gestures in real time with a low-cost image processing accelerator.


Central and Pairwise Data Clustering by Competitive Neural Networks

Neural Information Processing Systems

Data clustering amounts to a combinatorial optimization problem to reduce the complexity of a data representation and to increase its precision. Central and pairwise data clustering are studied in the maximum entropy framework. For central clustering we derive a set of reestimation equations and a minimization procedure which yields an optimal number of clusters, their centers and their cluster probabilities. A meanfield approximation for pairwise clustering is used to estimate assignment probabilities. A se1fconsistent solution to multidimensional scaling and pairwise clustering is derived which yields an optimal embedding and clustering of data points in a d-dimensional Euclidian space. 1 Introduction A central problem in information processing is the reduction of the data complexity with minimal loss in precision to discard noise and to reveal basic structure of data sets. Data clustering addresses this tradeoff by optimizing a cost function which preserves the original data as complete as possible and which simultaneously favors prototypes with minimal complexity (Linde et aI., 1980; Gray, 1984; Chou et aI., 1989; Rose et ai., 1990). We discuss an objective function for the joint optimization of distortion errors and the complexity of a reduced data representation. A maximum entropy estimation of the cluster assignments yields a unifying framework for clustering algorithms with a number of different distortion and complexity measures. The close analogy of complexity optimized clustering with winner-take-all neural networks suggests a neural-like implementation resembling topological feature maps (see Figure 1).


Clustering with a Domain-Specific Distance Measure

Neural Information Processing Systems

The distance measure and learning problem are formally described as nested objective functions. We derive an efficient algorithm by using optimization techniques that allow us to divide up the objective function into parts which may be minimized in distinct phases. The algorithm has accurately recreated 10 prototypes from a randomly generated sample database of 100 images consisting of 20 points each in 120 experiments. Finally, by incorporating permutation invariance in our distance measure, we have a technique that we may be able to apply to the clustering of graphs. Our goal is to develop measures which will enable the learning of objects with shape or structure. Acknowledgements This work has been supported by AFOSR grant F49620-92-J-0465 and ONR/DARPA grant N00014-92-J-4048.


Classifying Hand Gestures with a View-Based Distributed Representation

Neural Information Processing Systems

We present a method for learning, tracking, and recognizing human hand gestures recorded by a conventional CCD camera without any special gloves or other sensors. A view-based representation is used to model aspects of the hand relevant to the trained gestures, and is found using an unsupervised clustering technique. We use normalized correlation networks, withdynamic time warping in the temporal domain, as a distance function for unsupervised clustering. Views are computed separably for space and time dimensions; the distributed response of the combination of these units characterizes the input data with a low dimensional representation. Asupervised classification stage uses labeled outputs of the spatiotemporal units as training data. Our system can correctly classify gestures in real time with a low-cost image processing accelerator.