AITopics | Statistical Learning

Statistical Learning

ABC-LogitBoost for Multi-class Classification

arXiv.org Artificial Intelligence, Aug-28-2009

We develop abc-logitboost, based on the prior work on abc-boost and robust logitboost. Our extensive experiments on a variety of datasets demonstrate the considerable improvement of abc-logitboost over logitboost and abc-mart.

artificial intelligence, machine learning, mis-classification error, (16 more...)

arXiv:0908.4144

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies

Blei, David M., Griffiths, Thomas L., Jordan, Michael I.

arXiv.org Machine Learning, Aug-27-2009

We present the nested Chinese restaurant process (nCRP), a stochastic process which assigns probability distributions to infinitely-deep, infinitely-branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP lead to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning: the use of Bayesian nonparametric methods to infer distributions on flexible data structures.
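
As a rough illustration of the "documents as paths down a random tree" idea in the abstract above, the following sketch draws paths with a Chinese restaurant process at every node: an existing child is chosen in proportion to how many earlier paths used it, and a new child is opened with weight gamma. The function name, the dict-based tree layout, the fixed depth, and the parameter name `gamma` are assumptions made for this example, not the authors' code, and topics and posterior inference are not modeled at all.

```python
import random


def sample_ncrp_path(tree, depth, gamma, rng):
    """Draw one path of length `depth` from the root and update counts in place.

    `tree` is a node of the form {"count": int, "children": [nodes]}.
    This layout is a hypothetical choice for the sketch.
    """
    node = tree
    node["count"] += 1
    path = []
    for _ in range(depth):
        children = node["children"]
        # Existing children are weighted by past visits; the last slot
        # (weight gamma) stands for opening a brand-new child.
        weights = [child["count"] for child in children] + [gamma]
        idx = rng.choices(range(len(weights)), weights=weights)[0]
        if idx == len(children):
            children.append({"count": 0, "children": []})
        node = children[idx]
        node["count"] += 1
        path.append(idx)
    return path


rng = random.Random(0)
root = {"count": 0, "children": []}
paths = [sample_ncrp_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
```

Because popular children attract more paths, documents tend to share prefixes near the root and diverge deeper in the tree, which is the preferential-attachment clustering the abstract describes.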

artificial intelligence, machine learning, natural language, (18 more...)

arXiv:0710.0845

Country: North America > United States > California (0.46)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Consumer Products & Services > Restaurants (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Kronecker Graphs: An Approach to Modeling Networks

Leskovec, Jure, Chakrabarti, Deepayan, Kleinberg, Jon, Faloutsos, Christos, Ghahramani, Zoubin

arXiv.org Machine Learning, Aug-21-2009

How can we model networks with a mathematically tractable model that allows for rigorous analysis of network properties? Networks exhibit a long list of surprising properties: heavy tails for the degree distribution; small diameters; and densification and shrinking diameters over time. Most present network models either fail to match several of the above properties, are complicated to analyze mathematically, or both. In this paper we propose a generative model for networks that is both mathematically tractable and can generate networks that have the above-mentioned properties. Our main idea is to use the Kronecker product to generate graphs that we refer to as "Kronecker graphs". First, we prove that Kronecker graphs naturally obey common network properties. We also provide empirical evidence showing that Kronecker graphs can effectively model the structure of real networks. We then present KronFit, a fast and scalable algorithm for fitting the Kronecker graph generation model to large real networks. A naive approach to fitting would take super-exponential time. In contrast, KronFit takes linear time, by exploiting the structure of Kronecker matrix multiplication and by using statistical simulation techniques. Experiments on large real and synthetic networks show that KronFit finds accurate parameters that closely mimic the properties of target networks. Once fitted, the model parameters can be used to gain insights about the network structure, and the resulting synthetic graphs can be used for null-models, anonymization, extrapolations, and graph summarization.
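
The core construction in the abstract above, repeatedly taking the Kronecker product of a small initiator adjacency matrix with itself, can be sketched in a few lines. This is a minimal deterministic illustration only: the function names and the 2x2 initiator are assumptions chosen for the example, and neither the stochastic variant nor KronFit is shown.

```python
def kronecker_product(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    # Entry (i, j) of the result is a[i // m][j // m] * b[i % m][j % m].
    return [
        [a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
        for i in range(n * m)
    ]


def kronecker_power(initiator, k):
    """Adjacency matrix of the k-th Kronecker power of `initiator` (k >= 1)."""
    graph = initiator
    for _ in range(k - 1):
        graph = kronecker_product(graph, initiator)
    return graph


# Hypothetical 2-node initiator: a self-loop on each node plus one edge 0 -> 1.
initiator = [[1, 1], [0, 1]]
g2 = kronecker_power(initiator, 2)   # 4 x 4 adjacency matrix
g3 = kronecker_power(initiator, 3)   # 8 x 8 adjacency matrix
edges_g3 = sum(map(sum, g3))
```

With an initiator of N1 nodes and E1 edges, the k-th power has N1^k nodes and E1^k edges, so the edge count grows as a power of the node count; this is the kind of property the abstract refers to as densification.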

artificial intelligence, data mining, machine learning, (19 more...)

arXiv:0812.4905

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

A hierarchical Dirichlet process mixture model for haplotype reconstruction from multi-population data

Sohn, Kyung-Ah, Xing, Eric P.

arXiv.org Machine Learning, Aug-20-2009

The perennial problem of "how many clusters?" remains an issue of substantial interest in data mining and machine learning communities, and becomes particularly salient in large data sets such as populational genomic data where the number of clusters needs to be relatively large and open-ended. This problem gets further complicated in a co-clustering scenario in which one needs to solve multiple clustering problems simultaneously because of the presence of common centroids (e.g., ancestors) shared by clusters (e.g., possible descents from a certain ancestor) from different multiple-cluster samples (e.g., different human subpopulations). In this paper we present a hierarchical nonparametric Bayesian model to address this problem in the context of multi-population haplotype inference. Uncovering the haplotypes of single nucleotide polymorphisms is essential for many biological and medical applications. While it is not uncommon for the genotype data to be pooled from multiple ethnically distinct populations, few existing programs have explicitly leveraged the individual ethnic information for haplotype inference. In this paper we present a new haplotype inference program, Haploi, which makes use of such information and is readily applicable to genotype sequences with thousands of SNPs from heterogeneous populations, with competent and sometimes superior speed and accuracy compared to the state-of-the-art programs. Underlying Haploi is a new haplotype distribution model based on a nonparametric Bayesian formalism known as the hierarchical Dirichlet process, which represents a tractable surrogate to the coalescent process. The proposed model is exchangeable, unbounded, and capable of coupling demographic information of different populations.

artificial intelligence, haplotype, machine learning, (17 more...)

doi: 10.1214/08-AOAS225

arXiv:0812.4648

Country: North America > United States > Pennsylvania (0.28)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

High-dimensional variable selection

Wasserman, Larry, Roeder, Kathryn

arXiv.org Machine Learning, Aug-20-2009

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression methods. In the first stage we fit a set of candidate models. In the second stage we select one model by cross-validation. In the third stage we use hypothesis testing to eliminate some variables. We refer to the first two stages as "screening" and the last stage as "cleaning." We consider three screening methods: the lasso, marginal regression, and forward stepwise regression. Our method gives consistent variable selection under certain conditions.

artificial intelligence, assumption, machine learning, (17 more...)

doi: 10.1214/08-AOS646

arXiv:0704.1139

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Dynamic quantum clustering: a method for visual exploration of structures in data

Weinstein, Marvin, Horn, David

arXiv.org Machine Learning, Aug-18-2009

A given set of data-points in some feature space may be associated with a Schrödinger equation whose potential is determined by the data. This is known to lead to good clustering solutions. Here we extend this approach into a full-fledged dynamical scheme using a time-dependent Schrödinger equation. Moreover, we approximate this Hamiltonian formalism by a truncated calculation within a set of Gaussian wave functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data-points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition or feature filtering.

dqc evolution, evolution, svd-entropy, (14 more...)

doi: 10.1103/PhysRevE.80.066117

arXiv:0908.2644

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Classification by Set Cover: The Prototype Vector Machine

Bien, Jacob, Tibshirani, Robert

arXiv.org Machine Learning, Aug-17-2009

We introduce a new nearest-prototype classifier, the prototype vector machine (PVM). It arises from a combinatorial optimization problem which we cast as a variant of the set cover problem. We propose two algorithms for approximating its solution. The PVM selects a relatively small number of representative points which can then be used for classification. It contains 1-NN as a special case. The method is compatible with any dissimilarity measure, making it amenable to situations in which the data are not embedded in an underlying feature space or in which using a non-Euclidean metric is desirable. Indeed, we demonstrate on the much-studied ZIP code data how the PVM can reap the benefits of a problem-specific metric. In this example, the PVM outperforms the highly successful 1-NN with tangent distance, and does so while retaining fewer than half of the data points. This example highlights the strengths of the PVM in yielding a low-error, highly interpretable model. Additionally, we apply the PVM to a protein classification problem in which a kernel-based distance is used.

artificial intelligence, machine learning, prototype, (16 more...)

arXiv:0908.2284

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Discrete Temporal Models of Social Networks

Hanneke, Steve, Fu, Wenjie, Xing, Eric

arXiv.org Machine Learning, Aug-9-2009

The field of social network analysis is concerned with populations of actors, interconnected by a set of relations (e.g., friendship, communication). These relationships can be concisely described by directed graphs, with one vertex for each actor and an edge for each relation between a pair of actors. This network representation of a population can provide insight into organizational structures, social behavior patterns, emergence of global structure from local dynamics, and a variety of other social phenomena. There has been increasing demand for flexible statistical models of social networks, for the purposes of scientific exploration and as a basis for practical analysis and data mining tools. The subject of modeling a static social network has been investigated in some depth. For time-invariant networks, represented as a single directed or undirected graph, a number of flexible statistical models have been proposed, including the classic Exponential Random Graph Models (ERGM) and extensions (Frank and Strauss, 1986; Wasserman and Robins, 2005; Snijders, 2002; Robins and Pattison, 2005), which are descriptive in nature, latent space models that aim towards clustering and community discovery (Handcock and Raftery, 2007), and mixed-membership block models for role discovery (Airoldi et al., 2008). Of particular relevance to this paper is the ERGM, which is particularly flexible in that it can be customized to capture a wide range of signature connectivity patterns in the network via user-specified functions representing their sufficient statistics. Specifically, if N is some representation of a social network, and 𝒩 is the set of all possible networks in this representation, then the probability distribution function for any ERGM can be written in the following general form.

artificial intelligence, data mining, machine learning, (22 more...)

arXiv:0908.1258

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
(2 more...)

Node discovery problem for a social network

Maeno, Yoshiharu

arXiv.org Artificial Intelligence, Aug-7-2009

Methods to solve a node discovery problem for a social network are presented. Covert nodes are nodes that are not directly observable. They transmit influence and affect the resulting collaborative activities among the persons in a social network, but do not appear in the surveillance logs that record the participants in those activities. Discovering the covert nodes means identifying the suspicious logs in which the covert nodes would appear if they became overt. The performance of the methods is demonstrated with a test dataset generated from computationally synthesized networks and a real organization.

data mining, machine learning, node, (19 more...)

arXiv:0710.4975

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Law Enforcement & Public Safety > Terrorism (0.95)
Information Technology > Services (0.84)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Social Media (0.94)
(2 more...)

Support Vector Machine Classification with Indefinite Kernels

Luss, Ronny, d'Aspremont, Alexandre

arXiv.org Artificial Intelligence, Aug-4-2009

We propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our algorithm simultaneously computes support vectors and a proxy kernel matrix used in forming the loss. This can be interpreted as a penalized kernel learning problem where indefinite kernel matrices are treated as noisy observations of a true Mercer kernel. Our formulation keeps the problem convex, and relatively large problems can be solved efficiently using the projected gradient or analytic center cutting plane methods. We compare the performance of our technique with other methods on several classic data sets.

artificial intelligence, kernel, machine learning, (17 more...)

arXiv:0804.0188

Country:

North America > United States (0.48)
Asia (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.48)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
