AITopics | Country

Collaborating Authors

Country

A hierarchical Dirichlet process mixture model for haplotype reconstruction from multi-population data

arXiv.org Machine LearningAug-20-2009

The perennial problem of "how many clusters?" remains an issue of substantial interest in data mining and machine learning communities, and becomes particularly salient in large data sets such as populational genomic data where the number of clusters needs to be relatively large and open-ended. This problem gets further complicated in a co-clustering scenario in which one needs to solve multiple clustering problems simultaneously because of the presence of common centroids (e.g., ancestors) shared by clusters (e.g., possible descents from a certain ancestor) from different multiple-cluster samples (e.g., different human subpopulations). In this paper we present a hierarchical nonparametric Bayesian model to address this problem in the context of multi-population haplotype inference. Uncovering the haplotypes of single nucleotide polymorphisms is essential for many biological and medical applications. While it is uncommon for the genotype data to be pooled from multiple ethnically distinct populations, few existing programs have explicitly leveraged the individual ethnic information for haplotype inference. In this paper we present a new haplotype inference program, Haploi, which makes use of such information and is readily applicable to genotype sequences with thousands of SNPs from heterogeneous populations, with competent and sometimes superior speed and accuracy comparing to the state-of-the-art programs. Underlying Haploi is a new haplotype distribution model based on a nonparametric Bayesian formalism known as the hierarchical Dirichlet process, which represents a tractable surrogate to the coalescent process. The proposed model is exchangeable, unbounded, and capable of coupling demographic information of different populations.

artificial intelligence, haplotype, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1214/08-AOAS225

0812.4648

Country: North America > United States > Pennsylvania (0.28)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

High-dimensional variable selection

Wasserman, Larry, Roeder, Kathryn

arXiv.org Machine LearningAug-20-2009

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression methods. In the first stage we fit a set of candidate models. In the second stage we select one model by cross-validation. In the third stage we use hypothesis testing to eliminate some variables. We refer to the first two stages as "screening" and the last stage as "cleaning." We consider three screening methods: the lasso, marginal regression, and forward stepwise regression. Our method gives consistent variable selection under certain conditions.

artificial intelligence, assumption, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1214/08-AOS646

0704.1139

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

Dynamic quantum clustering: a method for visual exploration of structures in data

Weinstein, Marvin, Horn, David

arXiv.org Machine LearningAug-18-2009

A given set of data-points in some feature space may be associated with a Schrodinger equation whose potential is determined by the data. This is known to lead to good clustering solutions. Here we extend this approach into a full-fledged dynamical scheme using a time-dependent Schrodinger equation. Moreover, we approximate this Hamiltonian formalism by a truncated calculation within a set of Gaussian wave functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data-points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition or feature filtering.

dqc evolution, evolution, svd-entropy, (14 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.80.066117

0908.2644

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Compiling Uncertainty Away in Conformant Planning Problems with Bounded Width

Palacios, H., Geffner, H.

Journal of Artificial Intelligence ResearchAug-17-2009

Conformant planning is the problem of finding a sequence of actions for achieving a goal in the presence of uncertainty in the initial state or action effects. The problem has been approached as a path-finding problem in belief space where good belief representations and heuristics are critical for scaling up. In this work, a different formulation is introduced for conformant problems with deterministic actions where they are automatically converted into classical ones and solved by an off-the-shelf classical planner. The translation maps literals L and sets of assumptions t about the initial situation, into new literals KL/t that represent that L must be true if t is initially true. We lay out a general translation scheme that is sound and establish the conditions under which the translation is also complete. We show that the complexity of the complete translation is exponential in a parameter of the problem called the conformant width, which for most benchmarks is bounded. The planner based on this translation exhibits good performance in comparison with existing planners, and is the basis for T0, the best performing planner in the Conformant Track of the 2006 International Planning Competition.

initial situation, initial state, translation, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2708

AI Access Foundation

10618

Journal of Artificial Intelligence Research

Country: Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)

Add feedback

Classification by Set Cover: The Prototype Vector Machine

Bien, Jacob, Tibshirani, Robert

arXiv.org Machine LearningAug-17-2009

We introduce a new nearest-prototype classifier, the prototype vector machine (PVM). It arises from a combinatorial optimization problem which we cast as a variant of the set cover problem. We propose two algorithms for approximating its solution. The PVM selects a relatively small number of representative points which can then be used for classification. It contains 1-NN as a special case. The method is compatible with any dissimilarity measure, making it amenable to situations in which the data are not embedded in an underlying feature space or in which using a non-Euclidean metric is desirable. Indeed, we demonstrate on the much studied ZIP code data how the PVM can reap the benefits of a problem-specific metric. In this example, the PVM outperforms the highly successful 1-NN with tangent distance, and does so retaining fewer than half of the data points. This example highlights the strengths of the PVM in yielding a low-error, highly interpretable model. Additionally, we apply the PVM to a protein classification problem in which a kernel-based distance is used.

artificial intelligence, machine learning, prototype, (16 more...)

arXiv.org Machine Learning

0908.2284

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

View-based Propagator Derivation

Schulte, Christian, Tack, Guido

arXiv.org Artificial IntelligenceAug-14-2009

When implementing a propagator for a constraint, one must decide about variants: When implementing min, should one also implement max? Should one implement linear constraints both with unit and non-unit coefficients? Constraint variants are ubiquitous: implementing them requires considerable (if not prohibitive) effort and decreases maintainability, but will deliver better performance than resorting to constraint decomposition. This paper shows how to use views to derive perfect propagator variants. A model for views and derived propagators is introduced. Derived propagators are proved to be indeed perfect in that they inherit essential properties such as correctness and domain and bounds consistency. Techniques for systematically deriving propagators such as transformation, generalization, specialization, and type conversion are developed. The paper introduces an implementation architecture for views that is independent of the underlying constraint programming system. A detailed evaluation of views implemented in Gecode shows that derived propagators are efficient and that views often incur no overhead. Without views, Gecode would either require 180 000 rather than 40 000 lines of propagator code, or would lack many efficient propagator variants. Compared to 8 000 lines of code for views, the reduction in code for propagators yields a 1750% return on investment.

artificial intelligence, programming language, propagator, (18 more...)

arXiv.org Artificial Intelligence

0908.2050

Country:

North America > United States (0.67)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Add feedback

Statistical ranking and combinatorial Hodge theory

Jiang, Xiaoye, Lim, Lek-Heng, Yao, Yuan, Ye, Yinyu

arXiv.org Machine LearningAug-10-2009

We propose a number of techniques for obtaining a global ranking from data that may be incomplete and imbalanced -- characteristics almost universal to modern datasets coming from e-commerce and internet applications. We are primarily interested in score or rating-based cardinal data. From raw ranking data, we construct pairwise rankings, represented as edge flows on an appropriate graph. Our statistical ranking method uses the graph Helmholtzian, the graph theoretic analogue of the Helmholtz operator or vector Laplacian, in much the same way the graph Laplacian is an analogue of the Laplace operator or scalar Laplacian. We study the graph Helmholtzian using combinatorial Hodge theory: we show that every edge flow representing pairwise ranking can be resolved into two orthogonal components, a gradient flow that represents the L2-optimal global ranking and a divergence-free flow (cyclic) that measures the validity of the global ranking obtained -- if this is large, then the data does not have a meaningful global ranking. This divergence-free flow can be further decomposed orthogonally into a curl flow (locally cyclic) and a harmonic flow (locally acyclic but globally cyclic); these provides information on whether inconsistency arises locally or globally. An obvious advantage over the NP-hard Kemeny optimization is that discrete Hodge decomposition may be computed via a linear least squares regression. We also investigated the L1-projection of edge flows, showing that this is dual to correlation maximization over bounded divergence-free flows, and the L1-approximate sparse cyclic ranking, showing that this is dual to correlation maximization over bounded curl-free flows. We discuss relations with Kemeny optimization, Borda count, and Kendall-Smith consistency index from social choice theory and statistics.

artificial intelligence, machine learning, ranking, (18 more...)

arXiv.org Machine Learning

0811.1067

Country: North America > United States > California (0.67)

Genre: Research Report > New Finding (0.92)

Industry:

Banking & Finance (1.00)
Leisure & Entertainment (0.94)
Media > Film (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

Add feedback

Discrete Temporal Models of Social Networks

Hanneke, Steve, Fu, Wenjie, Xing, Eric

arXiv.org Machine LearningAug-9-2009

The field of social network analysis is concerned with populations of actors, interconnected by a set of relations (e.g., friendship, communication, etc.). These relationships can be concisely described by directed graphs, with one vertex for each actor and an edge for each relation between a pair of actors. This network representation of a population can provide insight into organizational structures, social behavior patterns, emergence of global structure from local dynamics, and a variety of other social phenomena. There has been increasing demand for flexible statistical models of social networks, for the purposes of scientific exploration and as a basis for practical analysis and data mining tools. The subject of modeling a static social network has been investigated in some depth. For time-invariant networks, represented as a single directed or undirected graph, a number of flexible statistical models have been proposed, including the classic Exponential Random Graph Models (ERGM) and extensions (Frank and Strauss, 1986; Wasserman and Robins, 2005; Snijders, 2002; Robins and Pattison, 2005), which are descriptive in nature, latent space models that aim towards clustering and community discovery (Handcock and Raftery, 2007), and mixed-membership block models for role discovery (Airoldi et al., 2008). Of particular relevance to this paper is the ERGM, which is particularly flexible in that it can be customized to capture a wide range of signature connectivity patterns in the network via user-specified functions representing their sufficient statistics. Specifically, if N is some representation of a social network, and N is the set of all possible networks in this representation, then the probability distribution function for any ERGM can be written in the following general 2 form.

artificial intelligence, data mining, machine learning, (22 more...)

arXiv.org Machine Learning

0908.1258

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
(2 more...)

Add feedback

Node discovery problem for a social network

Maeno, Yoshiharu

arXiv.org Artificial IntelligenceAug-7-2009

Methods to solve a node discovery problem for a social network are presented. Covert nodes refer to the nodes which are not observable directly. They transmit the influence and affect the resulting collaborative activities among the persons in a social network, but do not appear in the surveillance logs which record the participants of the collaborative activities. Discovering the covert nodes is identifying the suspicious logs where the covert nodes would appear if the covert nodes became overt. The performance of the methods is demonstrated with a test dataset generated from computationally synthesized networks and a real organization.

data mining, machine learning, node, (19 more...)

arXiv.org Artificial Intelligence

0710.4975

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Law Enforcement & Public Safety > Terrorism (0.95)
Information Technology > Services (0.84)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Social Media (0.94)
(2 more...)

Add feedback

Support Vector Machine Classification with Indefinite Kernels

Luss, Ronny, d'Aspremont, Alexandre

arXiv.org Artificial IntelligenceAug-4-2009

We propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our algorithm simultaneously computes support vectors and a proxy kernel matrix used in forming the loss. This can be interpreted as a penalized kernel learning problem where indefinite kernel matrices are treated as a noisy observations of a true Mercer kernel. Our formulation keeps the problem convex and relatively large problems can be solved efficiently using the projected gradient or analytic center cutting plane methods. We compare the performance of our technique with other methods on several classic data sets.

artificial intelligence, kernel, machine learning, (17 more...)

arXiv.org Artificial Intelligence

0804.0188

Country:

North America > United States (0.48)
Asia (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.48)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback