AITopics

A Nested HDP for Hierarchical Topic Models

Paisley, John, Wang, Chong, Blei, David, Jordan, Michael I.

We develop a nested hierarchical Dirichlet process (nHDP) for hierarchical topic modeling. The nHDP is a generalization of the nested Chinese restaurant process (nCRP) that allows each word to follow its own path to a topic node according to a document-specific distribution on a shared tree. This alleviates the rigid, single-path formulation of the nCRP, allowing a document to more easily express thematic borrowings as a random effect. We demonstrate our algorithm on 1.8 million documents from The New York Times.

artificial intelligence, natural language, ncrp, (13 more...)

1301.357

Country:

Europe (1.00)
Asia > Middle East > Jordan (0.17)
Asia > Middle East > Lebanon (0.15)

Genre: Research Report (0.40)

Industry:

Government > Military (0.97)
Banking & Finance (0.96)
Leisure & Entertainment > Sports > Football (0.30)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.41)

Multi-agent learning using Fictitious Play and Extended Kalman Filter

Smyrnakis, Michalis

Decentralised optimisation tasks are important components of multi-agent systems. These tasks can be interpreted as n-player potential games: therefore game-theoretic learning algorithms can be used to solve decentralised optimisation tasks. Fictitious play is the canonical example of these algorithms. Nevertheless fictitious play implicitly assumes that players have stationary strategies. We present a novel variant of fictitious play where players predict their opponents' strategies using Extended Kalman filters and use their predictions to update their strategies. We show that in 2 by 2 games with at least one pure Nash equilibrium and in potential games where players have two available actions, the proposed algorithm converges to the pure Nash equilibrium. The performance of the proposed algorithm was empirically tested, in two strategic form games and an ad-hoc sensor network surveillance problem. The proposed algorithm performs better than the classic fictitious play algorithm in these games and therefore improves the performance of game-theoretical learning in decentralised optimisation.

artificial intelligence, machine learning, nash equilibrium, (19 more...)

1301.3347

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

An Efficient Sufficient Dimension Reduction Method for Identifying Genetic Variants of Clinical Significance

Xiong, Momiao, Ma, Long

Fast and cheaper next generation sequencing technologies will generate unprecedentedly massive and highly-dimensional genomic and epigenomic variation data. In the near future, a routine part of medical record will include the sequenced genomes. A fundamental question is how to efficiently extract genomic and epigenomic variants of clinical utility which will provide information for optimal wellness and interference strategies. Traditional paradigm for identifying variants of clinical validity is to test association of the variants. However, significantly associated genetic variants may or may not be usefulness for diagnosis and prognosis of diseases. Alternative to association studies for finding genetic variants of predictive utility is to systematically search variants that contain sufficient information for phenotype prediction. To achieve this, we introduce concepts of sufficient dimension reduction and coordinate hypothesis which project the original high dimensional data to very low dimensional space while preserving all information on response phenotypes. We then formulate clinically significant genetic variant discovery problem into sparse SDR problem and develop algorithms that can select significant genetic variants from up to or even ten millions of predictors with the aid of dividing SDR for whole genome into a number of subSDR problems defined for genomic regions. The sparse SDR is in turn formulated as sparse optimal scoring problem, but with penalty which can remove row vectors from the basis matrix. To speed up computation, we develop the modified alternating direction method for multipliers to solve the sparse optimal scoring problem which can easily be implemented in parallel. To illustrate its application, the proposed method is applied to simulation data and the NHLBI's Exome Sequencing Project dataset

artificial intelligence, machine learning, variant, (13 more...)

1301.3528

Country: North America > United States > Texas (0.16)

Genre: Research Report > Experimental Study (0.36)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.65)

Dinalankara, Wikum, Bravo, Hector Corrada

Anomaly Classification with the Anti-Profile Support Vector Machine

We introduce the anti-profile Support Vector Machine (apSVM) as a novel algorithm to address the anomaly classification problem, an extension of anomaly detection where the goal is to distinguish data samples from a number of anomalous and heterogeneous classes based on their pattern of deviation from a normal stable class. We show that under heterogeneity assumptions defined here that the apSVM can be solved as the dual of a standard SVM with an indirect kernel that measures similarity of anomalous samples through similarity to the stable normal class. We characterize this indirect kernel as the inner product in a Reproducing Kernel Hilbert Space between representers that are projected to the subspace spanned by the representers of the normal samples. We show by simulation and application to cancer genomics datasets that the anti-profile SVM produces classifiers that are more accurate and stable than the standard SVM in the anomaly classification setting.

artificial intelligence, classifier, machine learning, (14 more...)

1301.3514

Country:

North America > United States > Wisconsin (0.28)
North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Goldberg, Yair, Kosorok, Michael R.

Support Vector Regression for Right Censored Data

arXiv.org Machine LearningJan-12-2013

In many medical studies, estimating the failure time distribution function, or quantities that depend on this distribution, as a function of patient demographic and prognostic variables, is of central importance for risk assessment and health planing. Frequently, such data is subject to right censoring. The goal of this paper is to develop tools for analyzing such data using machine learning techniques. Traditional approaches to right censored failure time analysis include using parametric models, such as the Weibull distribution, and semiparametric models such as proportional hazard models (see Lawless, 2003, for both). Even when less stringent models--such as nonparametric estimation--are used, it is typically assumed that the distribution function is smooth in both time and covariates (Dabrowska, 1987; Gonzalez-Manteiga and Cadarso-Suarez, 1994). These assumptions seem restrictive, especially when considering today's high-dimensional data settings.

artificial intelligence, goldberg and kosorok svm, machine learning, (16 more...)

1202.513

Country: North America > United States (0.92)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.82)

Vlachos, Ioannis, Kugiumtzis, Dimitris

Backward-in-Time Selection of the Order of Dynamic Regression Prediction Model

arXiv.org Machine LearningJan-11-2013

We investigate the optimal structure of dynamic regression models used in multivariate time series prediction and propose a scheme to form the lagged variable structure called Backward-in-Time Selection (BTS) that takes into account feedback and multi-collinearity, often present in multivariate time series. We compare BTS to other known methods, also in conjunction with regularization techniques used for the estimation of model parameters, namely principal components, partial least squares and ridge regression estimation. The predictive efficiency of the different models is assessed by means of Monte Carlo simulations for different settings of feedback and multi-collinearity. The results show that BTS has consistently good prediction performance while other popular methods have varying and often inferior performance. The prediction performance of BTS was also found the best when tested on human electroencephalograms of an epileptic seizure, and to the prediction of returns of indices of world financial markets.

artificial intelligence, machine learning, time sery, (16 more...)

1301.241

Country: North America > United States > Arizona (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Welling, Max, Teh, Yee Whye

Belief Optimization for Binary Networks: A Stable Alternative to Loopy Belief Propagation

arXiv.org Artificial IntelligenceJan-10-2013

We present a novel inference algorithm for arbitrary, binary, undirected graphs. Unlike loopy belief propagation, which iterates fixed point equations, we directly descend on the Bethe free energy. The algorithm consists of two phases, first we update the pairwise probabilities, given the marginal probabilities at each unit,using an analytic expression. Next, we update the marginal probabilities, given the pairwise probabilities by following the negative gradient of the Bethe free energy. Both steps are guaranteed to decrease the Bethe free energy, and since it is lower bounded, the algorithm is guaranteed to converge to a local minimum. We also show that the Bethe free energy is equal to the TAP free energy up to second order in the weights. In experiments we confirm that when belief propagation converges it usually finds identical solutions as our belief optimization method. However, in cases where belief propagation fails to converge, belief optimization continues to converge to reasonable beliefs. The stable nature of belief optimization makes it ideally suited for learning graphical models from data.

artificial intelligence, belief revision, free energy, (20 more...)

arXiv.org Artificial Intelligence

1301.2317

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

Popescul, Alexandrin, Ungar, Lyle H., Pennock, David M, Lawrence, Steve

Probabilistic Models for Unified Collaborative and Content-Based Recommendation in Sparse-Data Environments

arXiv.org Machine LearningJan-10-2013

Recommender systems leverage product and community information to target products to consumers. Researchers have developed collaborative recommenders, content-based recommenders, and (largely ad-hoc) hybrid systems. We propose a unified probabilistic framework for merging collaborative and content-based recommendations. We extend Hofmann's [1999] aspect model to incorporate three-way co-occurrence data among users, items, and item content. The relative influence of collaboration data versus content data is not imposed as an exogenous parameter, but rather emerges naturally from the given data sources. Global probabilistic models coupled with standard Expectation Maximization (EM) learning algorithms tend to drastically overfit in sparse-data situations, as is typical in recommendation applications. We show that secondary content information can often be used to overcome sparsity. Experiments on data from the ResearchIndex library of Computer Science publications show that appropriate mixture models incorporating secondary data produce significantly better quality recommenders than k-nearest neighbors (k-NN). Global probabilistic models also allow more general inferences than local methods like k-NN.

artificial intelligence, information, machine learning, (15 more...)

1301.2303

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.82)

Industry:

Media (0.47)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)

Muandet, Krikamol, Balduzzi, David, Schölkopf, Bernhard

Domain Generalization via Invariant Feature Representation

arXiv.org Machine LearningJan-10-2013

This paper investigates domain generalization: How to take knowledge acquired from an arbitrary number of related domains and apply it to previously unseen domains? We propose Domain-Invariant Component Analysis (DICA), a kernel-based optimization algorithm that learns an invariant transformation by minimizing the dissimilarity across domains, whilst preserving the functional relationship between input and output variables. A learning-theoretic analysis shows that reducing dissimilarity improves the expected generalization ability of classifiers on new domains, motivating the proposed algorithm. Experimental results on synthetic and real-world datasets demonstrate that DICA successfully learns invariant features and improves classifier performance in practice.

artificial intelligence, machine learning, optimization problem, (19 more...)

1301.2115

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (0.90)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)