Clustering
The Method of Quantum Clustering
We propose a novel clustering method that is an extension of ideas inherent to scale-space clustering and support-vector clustering. Like the latter, it associates every data point with a vector in Hilbert space, and like the former it puts emphasis on their total sum, which is equal to the scale-space probability function. The novelty of our approach is the study of an operator in Hilbert space, represented by the Schrödinger equation of which the probability function is a solution. This Schrödinger equation contains a potential function that can be derived analytically from the probability function.
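To make the last sentence concrete, here is a minimal sketch of how such a potential could be computed. It assumes the probability function is a Gaussian Parzen-window sum psi(x) = sum_i exp(-|x - x_i|^2 / (2 sigma^2)) and that the Schrödinger equation has the form (-(sigma^2/2) Laplacian + V) psi = E psi; neither form is spelled out in the abstract. Under those assumptions, solving for V gives V(x) - E = -d/2 + (1 / (2 sigma^2 psi)) sum_i |x - x_i|^2 exp(-|x - x_i|^2 / (2 sigma^2)), where d is the dimension. The function name and the test points are hypothetical.

```python
import math

def potential_minus_energy(x, points, sigma):
    """Return V(x) - E for an assumed Gaussian Parzen-window psi.

    x      : tuple of coordinates where the potential is evaluated
    points : list of data points (tuples of the same length as x)
    sigma  : kernel width (the scale parameter)
    """
    d = len(x)
    psi = 0.0        # sum of Gaussian kernels at x
    weighted = 0.0   # sum of squared distances weighted by the kernels
    for p in points:
        sq = sum((a - b) ** 2 for a, b in zip(x, p))
        w = math.exp(-sq / (2.0 * sigma ** 2))
        psi += w
        weighted += sq * w
    return weighted / (2.0 * sigma ** 2 * psi) - d / 2.0

# Two well-separated groups on a line (illustrative data only).
data = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0)]
v_center = potential_minus_energy((0.05, 0.0), data, sigma=1.0)
v_between = potential_minus_energy((2.55, 0.0), data, sigma=1.0)
```

In this toy example the potential is lower inside a group than halfway between the two groups, which is the kind of structure a minimum-seeking clustering step could exploit.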
Information Self-Service with a Knowledge Base That Learns
Durbin, Stephen D., Warner, Doug, Richter, J. Neal, Gedeon, Zuzana
Delivering effective customer service over the internet requires attention to many aspects of knowledge management if it is to be both satisfying for customers and economical for the company or other organization. In RightNow ESERVICE CENTER, such management is built into the architecture and supported by automatically gathering metainformation about the documents held in the core knowledge base. A variety of AI techniques are used to facilitate the construction, maintenance, and navigation of the knowledge base. These techniques include collaborative filtering, swarm intelligence, fuzzy logic, natural language processing, text clustering, and classification rule learning. Customers using ESERVICE CENTER report dramatic decreases in support costs and increases in customer satisfaction because of the ease of use provided by the self-learning features of the knowledge base.
Data Clustering by Markovian Relaxation and the Information Bottleneck Method
We introduce a new, nonparametric and principled, distance-based clustering method. This method combines a pairwise-based approach with a vector-quantization method which provides a meaningful interpretation of the resulting clusters. The idea is based on turning the distance matrix into a Markov process and then examining the decay of mutual information during the relaxation of this process. The clusters emerge as quasi-stable structures during this relaxation, and are then extracted using the information bottleneck method.
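The relaxation step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the transition probabilities are taken to be proportional to exp(-lam * distance) with rows normalized to one, the starting state is assumed uniformly distributed, and the mutual information is measured between the starting state and the state after t steps. The parameter `lam` and all function names are hypothetical, and the information bottleneck extraction step is not shown.

```python
import math

def transition_matrix(dist, lam=1.0):
    """Row-normalized exp(-lam * d_ij); an assumed kernel choice."""
    P = []
    for row in dist:
        w = [math.exp(-lam * d) for d in row]
        s = sum(w)
        P.append([v / s for v in w])
    return P

def matmul(A, B):
    """Plain matrix product for lists of lists."""
    inner = len(B)
    cols = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(A))]

def mutual_information(Pt):
    """I(start; state after t steps) in nats, with a uniform prior on the start."""
    n = len(Pt)
    marginal = [sum(Pt[i][j] for i in range(n)) / n for j in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = Pt[i][j]
            if p > 0.0:
                total += (p / n) * math.log(p / marginal[j])
    return total

def relaxation_curve(dist, steps, lam=1.0):
    """Mutual information after 1, 2, ..., steps applications of P."""
    P = transition_matrix(dist, lam)
    Pt = P
    curve = []
    for _ in range(steps):
        curve.append(mutual_information(Pt))
        Pt = matmul(Pt, P)
    return curve

# Six points on a line forming two tight groups (illustrative data only).
xs = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(a - b) for b in xs] for a in xs]
curve = relaxation_curve(dist, steps=6)
```

The curve can only decrease, since the information retained about the starting point cannot grow as the chain relaxes. Stretches where it decreases slowly would correspond to the quasi-stable structures mentioned above.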
Generalized Model Selection for Unsupervised Learning in High Dimensions
Vaithyanathan, Shivakumar, Dom, Byron
We describe a Bayesian approach to model selection in unsupervised learning that determines both the feature set and the number of clusters. We then evaluate this scheme (based on marginal likelihood) and one based on cross-validated likelihood. For the Bayesian scheme we derive a closed-form solution of the marginal likelihood by assuming appropriate forms of the likelihood function and prior. Extensive experiments compare these approaches and all results are verified by comparison against ground truth. In these experiments the Bayesian scheme using our objective function gave better results than cross-validation. 1 Introduction Recent efforts define the model selection problem as one of estimating the number of clusters [10, 17].
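As an illustration of what a closed-form marginal likelihood can look like, the sketch below assumes a multinomial likelihood over discrete features with a symmetric Dirichlet prior, a standard conjugate pair. The abstract does not state which forms the authors chose, so this pairing, the parameter `alpha`, and the function names are assumptions. A hard clustering is scored by summing the log marginal likelihood of each cluster's pooled counts; feature-set selection and any prior over cluster assignments are left out.

```python
import math

def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of an observation sequence with the given
    feature counts, under a multinomial likelihood and symmetric Dirichlet prior."""
    k = len(counts)
    n = sum(counts)
    result = math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
    for c in counts:
        result += math.lgamma(alpha + c) - math.lgamma(alpha)
    return result

def score_partition(docs, labels, alpha=1.0):
    """Sum of per-cluster log marginal likelihoods for a hard clustering."""
    pooled = {}
    for counts, label in zip(docs, labels):
        acc = pooled.setdefault(label, [0] * len(counts))
        for j, c in enumerate(counts):
            acc[j] += c
    return sum(log_marginal(acc, alpha) for acc in pooled.values())

# Four toy documents over a four-word vocabulary (illustrative data only).
docs = [[5, 5, 0, 0], [5, 5, 0, 0], [0, 0, 5, 5], [0, 0, 5, 5]]
one_cluster = score_partition(docs, [0, 0, 0, 0])
two_clusters = score_partition(docs, [0, 0, 1, 1])
```

On this toy data the two-cluster partition scores higher than the single cluster, so comparing such scores across candidate models is one way the number of clusters could be chosen.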
A MCMC Approach to Hierarchical Mixture Modelling
There are many hierarchical clustering algorithms available, but these lack a firm statistical basis. Here we set up a hierarchical probabilistic mixture model, where data is generated in a hierarchical tree-structured manner. Markov chain Monte Carlo (MCMC) methods are demonstrated which can be used to sample from the posterior distribution over trees containing variable numbers of hidden units.
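The generative side of such a model can be sketched as below. This is a simplified, hypothetical illustration: it uses a fixed depth and branching factor, one-dimensional Gaussian offsets from each parent, and a spread that halves at every level. None of these choices come from the abstract, and the MCMC sampling over tree structures is not shown.

```python
import random

def sample_tree_data(depth, branching, spread, noise, points_per_leaf, seed=0):
    """Generate 1-D data from a tree of Gaussian means.

    Each node's mean is drawn around its parent's mean; leaves emit
    observations with standard deviation `noise`. Returns a list of
    (leaf_path, value) pairs, where leaf_path is a tuple of child indices.
    """
    rng = random.Random(seed)
    data = []

    def descend(mean, level, path):
        if level == depth:
            # Leaf: emit noisy observations around this node's mean.
            for _ in range(points_per_leaf):
                data.append((path, rng.gauss(mean, noise)))
            return
        for child in range(branching):
            # Assumed schedule: children scatter less at deeper levels.
            child_mean = rng.gauss(mean, spread / (2 ** level))
            descend(child_mean, level + 1, path + (child,))

    descend(0.0, 0, ())
    return data

samples = sample_tree_data(depth=2, branching=3, spread=10.0,
                           noise=0.1, points_per_leaf=4)
```

Inference would run in the opposite direction: given only the values, a sampler would propose changes to the tree and accept or reject them according to the posterior.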