AITopics | Genre

Collaborating Authors

Genre

Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices

arXiv.org Machine LearningMay-31-2013

Although nonnegative matrix factorization (NMF) is NP-hard in general, it has been shown very recently that it is tractable under the assumption that the input nonnegative data matrix is close to being separable (separability requires that all columns of the input matrix belongs to the cone spanned by a small subset of these columns). Since then, several algorithms have been designed to handle this subclass of NMF problems. In particular, Bittorf, Recht, R\'e and Tropp (`Factoring nonnegative matrices with linear programs', NIPS 2012) proposed a linear programming model, referred to as Hottopixx. In this paper, we provide a new and more general robustness analysis of their method. In particular, we design a provably more robust variant using a post-processing strategy which allows us to deal with duplicates and near duplicates in the dataset.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

doi: 10.1137/120900629

1211.6687

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)

Add feedback

A Survey on Latent Tree Models and Applications

Mourad, R., Sinoquet, C., Zhang, N. L., Liu, T., Leray, P.

Journal of Artificial Intelligence ResearchMay-30-2013

In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from biological to human sciences. The latent tree model, a particular type of probabilistic graphical models, deserves attention. Its simple structure - a tree - allows simple and efficient inference, while its latent variables capture complex relationships. In the past decade, the latent tree model has been subject to significant theoretical and methodological developments. In this review, we propose a comprehensive study of this model. First we summarize key ideas underlying the model. Second we explain how it can be efficiently learned from data. Third we illustrate its use within three types of applications: latent structure discovery, multidimensional clustering, and probabilistic inference. Finally, we conclude and give promising directions for future researches in this field.

algorithm, complexity, ltm, (17 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3879

AI Access Foundation

10817

Journal of Artificial Intelligence Research

Country:

Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
(8 more...)

Genre:

Overview (0.87)
Research Report (0.67)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)

Add feedback

Predicting the Severity of Breast Masses with Data Mining Methods

Mokhtar, Sahar A., Elsayad, Alaa. M.

arXiv.org Machine LearningMay-30-2013

Mammography is the most effective and available tool for breast cancer screening. However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to approximately 70% unnecessary biopsies with benign outcomes. Data mining algorithms could be used to help physicians in their decisions to perform a breast biopsy on a suspicious lesion seen in a mammogram image or to perform a short term follow-up examination instead. In this research paper data mining classification algorithms; Decision Tree (DT), Artificial Neural Network (ANN), and Support Vector Machine (SVM) are analyzed on mammographic masses data set. The purpose of this study is to increase the ability of physicians to determine the severity (benign or malignant) of a mammographic mass lesion from BI-RADS attributes and the patient,s age. The whole data set is divided for training the models and test them by the ratio of 70:30% respectively and the performances of classification algorithms are compared through three statistical measures; sensitivity, specificity, and classification accuracy. Accuracy of DT, ANN and SVM are 78.12%, 80.56% and 81.25% of test samples respectively. Our analysis shows that out of these three classification models SVM predicts severity of breast cancer with least error rate and highest accuracy.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1305.7057

Country: North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (0.49)
Research Report > New Finding (0.47)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.77)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Non-linear dimensionality reduction: Riemannian metric estimation and the problem of geometric discovery

Perraul-Joncas, Dominique, Meila, Marina

arXiv.org Machine LearningMay-30-2013

In recent years, manifold learning has become increasingly popular as a tool for performing non-linear dimensionality reduction. This has led to the development of numerous algorithms of varying degrees of complexity that aim to recover man ifold geometry using either local or global features of the data. Building on the Laplacian Eigenmap and Diffusionmaps framework, we propose a new paradigm that offers a guarantee, under reasonable assumptions, that any manifo ld learning algorithm will preserve the geometry of a data set. Our approach is based on augmenting the output of embedding algorithms with geometric informatio n embodied in the Riemannian metric of the manifold. We provide an algorithm for estimating the Riemannian metric from data and demonstrate possible application s of our approach in a variety of examples.

artificial intelligence, machine learning, manifold, (16 more...)

arXiv.org Machine Learning

1305.7255

Country: North America > United States (1.00)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)

Add feedback

Structural and Functional Discovery in Dynamic Networks with Non-negative Matrix Factorization

Mankad, Shawn, Michailidis, George

arXiv.org Machine LearningMay-30-2013

Due to advances in data collection technologies, it is becoming increasingly common to study time series of networks. An important research question is how to discover the underlying structure and dynamics in time-varying networked systems. In this work, we propose a new matrix factorization-based approach for community discovery and visual exploration within potentially weighted and directed network time-series. Next, we review and discuss this work in relation to popular approaches for addressing the key problems of community detection and visualization of time series of networks. There have been many important contributions for community detection in network time-series, extensively reviewed in [1, 2], from the fields of physics, computer science and statistics. The basic goal of community detection is to extract groups of nodes that feature relatively dense within group connectivity and sparser between group connections [3, 4]. A common strategy is to embed the graphs in low-dimensional latent spaces. For instance, [5] use latent variables to capture groups of papers that evolve similarly in citation network data.

data mining, machine learning, node, (19 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.88.042812

1305.7169

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Networks (0.48)
Telecommunications > Networks (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Add feedback

Rotation invariants of two dimensional curves based on iterated integrals

Diehl, Joscha

arXiv.org Machine LearningMay-29-2013

We introduce a novel class of rotation invariants of two dimensional curves based on iterated integrals. The invariants we present are in some sense complete and we describe an algorithm to calculate them, giving explicit computations up to order six. We present an application to online (stroke-trajectory based) character recognition. This seems to be the first time in the literature that the use of iterated integrals of a curve is proposed for (invariant) feature extraction in machine learning applications.

invariant, machine learning, pattern recognition, (11 more...)

arXiv.org Machine Learning

1305.6883

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)

Add feedback

The Expressive Power of Word Embeddings

Chen, Yanqing, Perozzi, Bryan, Al-Rfou, Rami, Skiena, Steven

arXiv.org Machine LearningMay-29-2013

We seek to better understand the difference in quality of the several publicly released embeddings. We propose several tasks that help to distinguish the characteristics of different embeddings. Our evaluation of sentiment polarity and synonym/antonym relations shows that embeddings are able to capture surprisingly nuanced semantics even in the absence of sentence structure. Moreover, benchmarking the embeddings shows great variance in quality and characteristics of the semantics captured by the tested embeddings. Finally, we show the impact of varying the number of dimensions and the resolution of each dimension on the effective useful features captured by the embedding space. Our contributions highlight the importance of embeddings for NLP tasks and the effect of their quality on the final results.

dimension, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1301.3226

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.50)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Add feedback

An estimation of distribution algorithm with adaptive Gibbs sampling for unconstrained global optimization

Velasco, Jonás, Saucedo-Espinosa, Mario A., Escalante, Hugo Jair, Mendoza, Karlo, Villarreal-Rodríguez, César Emilio, Chacón-Mondragón, Óscar L., Rodríguez, Adrián, Berrones, Arturo

arXiv.org Machine LearningMay-29-2013

In this paper is proposed a new heuristic approach belonging to the field of evolutionary Estimation of Distribution Algorithms (EDAs). EDAs builds a probability model and a set of solutions is sampled from the model which characterizes the distribution of such solutions. The main framework of the proposed method is an estimation of distribution algorithm, in which an adaptive Gibbs sampling is used to generate new promising solutions and, in combination with a local search strategy, it improves the individual solutions produced in each iteration. The Estimation of Distribution Algorithm with Adaptive Gibbs Sampling we are proposing in this paper is called AGEDA. We experimentally evaluate and compare this algorithm against two deterministic procedures and several stochastic methods in three well known test problems for unconstrained global optimization. It is empirically shown that our heuristic is robust in problems that involve three central aspects that mainly determine the difficulty of global optimization problems, namely high-dimensionality, multi-modality and non-smoothness.

artificial intelligence, machine learning, unconstrained global optimization, (5 more...)

arXiv.org Machine Learning

1107.2104

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.80)

Add feedback

Normalized Online Learning

Ross, Stephane, Mineiro, Paul, Langford, John

arXiv.org Machine LearningMay-28-2013

We introduce online learning algorithms which are independent of feature scales, proving regret bounds dependent on the ratio of scales existent in the data rather than the absolute scale. This has several useful effects: there is no need to pre-normalize data, the test-time and test-space complexity are reduced, and the algorithms are more robust.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1305.6646

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)

Add feedback

Reinforcement Learning for the Soccer Dribbling Task

Carvalho, Arthur, Oliveira, Renato

arXiv.org Machine LearningMay-28-2013

We propose a reinforcement learning solution to the \emph{soccer dribbling task}, a scenario in which a soccer agent has to go from the beginning to the end of a region keeping possession of the ball, as an adversary attempts to gain possession. While the adversary uses a stationary policy, the dribbler learns the best action to take at each decision point. After defining meaningful variables to represent the state space, and high-level macro-actions to incorporate domain knowledge, we describe our application of the reinforcement learning algorithm \emph{Sarsa} with CMAC for function approximation. Our experiments show that, after the training period, the dribbler is able to accomplish its task against a strong adversary around 58% of the time.

dribbler, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/CIG.2011.6031994

1305.6568

Country: Europe (0.28)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.35)

Add feedback