AITopics | Industry

Collaborating Authors

Industry

High-dimensional variable selection

arXiv.org Machine LearningAug-20-2009

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression methods. In the first stage we fit a set of candidate models. In the second stage we select one model by cross-validation. In the third stage we use hypothesis testing to eliminate some variables. We refer to the first two stages as "screening" and the last stage as "cleaning." We consider three screening methods: the lasso, marginal regression, and forward stepwise regression. Our method gives consistent variable selection under certain conditions.

artificial intelligence, assumption, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1214/08-AOS646

0704.1139

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

Dynamic quantum clustering: a method for visual exploration of structures in data

Weinstein, Marvin, Horn, David

arXiv.org Machine LearningAug-18-2009

A given set of data-points in some feature space may be associated with a Schrodinger equation whose potential is determined by the data. This is known to lead to good clustering solutions. Here we extend this approach into a full-fledged dynamical scheme using a time-dependent Schrodinger equation. Moreover, we approximate this Hamiltonian formalism by a truncated calculation within a set of Gaussian wave functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data-points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition or feature filtering.

dqc evolution, evolution, svd-entropy, (14 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.80.066117

0908.2644

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Classification by Set Cover: The Prototype Vector Machine

Bien, Jacob, Tibshirani, Robert

arXiv.org Machine LearningAug-17-2009

We introduce a new nearest-prototype classifier, the prototype vector machine (PVM). It arises from a combinatorial optimization problem which we cast as a variant of the set cover problem. We propose two algorithms for approximating its solution. The PVM selects a relatively small number of representative points which can then be used for classification. It contains 1-NN as a special case. The method is compatible with any dissimilarity measure, making it amenable to situations in which the data are not embedded in an underlying feature space or in which using a non-Euclidean metric is desirable. Indeed, we demonstrate on the much studied ZIP code data how the PVM can reap the benefits of a problem-specific metric. In this example, the PVM outperforms the highly successful 1-NN with tangent distance, and does so retaining fewer than half of the data points. This example highlights the strengths of the PVM in yielding a low-error, highly interpretable model. Additionally, we apply the PVM to a protein classification problem in which a kernel-based distance is used.

artificial intelligence, machine learning, prototype, (16 more...)

arXiv.org Machine Learning

0908.2284

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Statistical ranking and combinatorial Hodge theory

Jiang, Xiaoye, Lim, Lek-Heng, Yao, Yuan, Ye, Yinyu

arXiv.org Machine LearningAug-10-2009

We propose a number of techniques for obtaining a global ranking from data that may be incomplete and imbalanced -- characteristics almost universal to modern datasets coming from e-commerce and internet applications. We are primarily interested in score or rating-based cardinal data. From raw ranking data, we construct pairwise rankings, represented as edge flows on an appropriate graph. Our statistical ranking method uses the graph Helmholtzian, the graph theoretic analogue of the Helmholtz operator or vector Laplacian, in much the same way the graph Laplacian is an analogue of the Laplace operator or scalar Laplacian. We study the graph Helmholtzian using combinatorial Hodge theory: we show that every edge flow representing pairwise ranking can be resolved into two orthogonal components, a gradient flow that represents the L2-optimal global ranking and a divergence-free flow (cyclic) that measures the validity of the global ranking obtained -- if this is large, then the data does not have a meaningful global ranking. This divergence-free flow can be further decomposed orthogonally into a curl flow (locally cyclic) and a harmonic flow (locally acyclic but globally cyclic); these provides information on whether inconsistency arises locally or globally. An obvious advantage over the NP-hard Kemeny optimization is that discrete Hodge decomposition may be computed via a linear least squares regression. We also investigated the L1-projection of edge flows, showing that this is dual to correlation maximization over bounded divergence-free flows, and the L1-approximate sparse cyclic ranking, showing that this is dual to correlation maximization over bounded curl-free flows. We discuss relations with Kemeny optimization, Borda count, and Kendall-Smith consistency index from social choice theory and statistics.

artificial intelligence, machine learning, ranking, (18 more...)

arXiv.org Machine Learning

0811.1067

Country: North America > United States > California (0.67)

Genre: Research Report > New Finding (0.92)

Industry:

Banking & Finance (1.00)
Leisure & Entertainment (0.94)
Media > Film (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

Add feedback

Discrete Temporal Models of Social Networks

Hanneke, Steve, Fu, Wenjie, Xing, Eric

arXiv.org Machine LearningAug-9-2009

The field of social network analysis is concerned with populations of actors, interconnected by a set of relations (e.g., friendship, communication, etc.). These relationships can be concisely described by directed graphs, with one vertex for each actor and an edge for each relation between a pair of actors. This network representation of a population can provide insight into organizational structures, social behavior patterns, emergence of global structure from local dynamics, and a variety of other social phenomena. There has been increasing demand for flexible statistical models of social networks, for the purposes of scientific exploration and as a basis for practical analysis and data mining tools. The subject of modeling a static social network has been investigated in some depth. For time-invariant networks, represented as a single directed or undirected graph, a number of flexible statistical models have been proposed, including the classic Exponential Random Graph Models (ERGM) and extensions (Frank and Strauss, 1986; Wasserman and Robins, 2005; Snijders, 2002; Robins and Pattison, 2005), which are descriptive in nature, latent space models that aim towards clustering and community discovery (Handcock and Raftery, 2007), and mixed-membership block models for role discovery (Airoldi et al., 2008). Of particular relevance to this paper is the ERGM, which is particularly flexible in that it can be customized to capture a wide range of signature connectivity patterns in the network via user-specified functions representing their sufficient statistics. Specifically, if N is some representation of a social network, and N is the set of all possible networks in this representation, then the probability distribution function for any ERGM can be written in the following general 2 form.

artificial intelligence, data mining, machine learning, (22 more...)

arXiv.org Machine Learning

0908.1258

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
(2 more...)

Add feedback

Node discovery problem for a social network

Maeno, Yoshiharu

arXiv.org Artificial IntelligenceAug-7-2009

Methods to solve a node discovery problem for a social network are presented. Covert nodes refer to the nodes which are not observable directly. They transmit the influence and affect the resulting collaborative activities among the persons in a social network, but do not appear in the surveillance logs which record the participants of the collaborative activities. Discovering the covert nodes is identifying the suspicious logs where the covert nodes would appear if the covert nodes became overt. The performance of the methods is demonstrated with a test dataset generated from computationally synthesized networks and a real organization.

data mining, machine learning, node, (19 more...)

arXiv.org Artificial Intelligence

0710.4975

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Law Enforcement & Public Safety > Terrorism (0.95)
Information Technology > Services (0.84)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Social Media (0.94)
(2 more...)

Add feedback

Support Vector Machine Classification with Indefinite Kernels

Luss, Ronny, d'Aspremont, Alexandre

arXiv.org Artificial IntelligenceAug-4-2009

We propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our algorithm simultaneously computes support vectors and a proxy kernel matrix used in forming the loss. This can be interpreted as a penalized kernel learning problem where indefinite kernel matrices are treated as a noisy observations of a true Mercer kernel. Our formulation keeps the problem convex and relatively large problems can be solved efficiently using the projected gradient or analytic center cutting plane methods. We compare the performance of our technique with other methods on several classic data sets.

artificial intelligence, kernel, machine learning, (17 more...)

arXiv.org Artificial Intelligence

0804.0188

Country:

North America > United States (0.48)
Asia (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.48)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Streamed Learning: One-Pass SVMs

Rai, Piyush, Daumé, Hal III, Venkatasubramanian, Suresh

arXiv.org Machine LearningAug-4-2009

We present a streaming model for large-scale classification (in the context of $\ell_2$-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The $\ell_2$-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient algorithm based on the idea of \emph{core sets} exists (Core Vector Machine, CVM). CVM learns a $(1+\varepsilon)$-approximate MEB for a set of points and yields an approximate solution to corresponding SVM instance. However CVM works in batch mode requiring multiple passes over the data. This paper presents a single-pass SVM which is based on the minimum enclosing ball of streaming data. We show that the MEB updates for the streaming case can be easily adapted to learn the SVM weight vector in a way similar to using online stochastic gradient updates. Our algorithm performs polylogarithmic computation at each example, and requires very small and constant storage. Experimental results show that, even in such restrictive settings, we can learn efficiently in just one pass and get accuracies comparable to other state-of-the-art SVM solvers (batch and online). We also give an analysis of the algorithm, and discuss some open issues and possible extensions.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

0908.0572

Country: North America > United States (0.49)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.87)

Add feedback

The Infinite Hierarchical Factor Regression Model

Rai, Piyush, Daumé, Hal III

arXiv.org Machine LearningAug-4-2009

We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors, and the relationship between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman's coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.

artificial intelligence, machine learning, matrix, (13 more...)

arXiv.org Machine Learning

0908.0570

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area (0.50)
Health & Medicine > Pharmaceuticals & Biotechnology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.85)

Add feedback

Knowledge Discovery of Hydrocyclone s Circuit Based on SONFIS and SORST

Ghaffari, H. O., Ejtemaei, M., Irannajad, M.

arXiv.org Artificial IntelligenceAug-1-2009

This study describes application of some approximate reasoning methods to analysis of hydrocyclone performance. In this manner, using a combining of Self Organizing Map (SOM), Neuro-Fuzzy Inference System (NFIS)-SONFIS- and Rough Set Theory (RST)-SORST-crisp and fuzzy granules are obtained. Balancing of crisp granules and non-crisp granules can be implemented in close-open iteration. Using different criteria and based on granulation level balance point (interval) or a pseudo-balance point is estimated. Validation of the proposed methods, on the data set of the hydrocyclone is rendered.

fuzzy logic, granule, upstream oil & gas, (17 more...)

arXiv.org Artificial Intelligence

0908.0089

Country:

Asia > Middle East > Iran (0.15)
Asia > Middle East > Republic of Türkiye (0.15)

Genre: Research Report (0.84)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback