AITopics

doi: 10.1613/jair.3834

AI Access Foundation

10818

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Austria > Vienna (0.04)

Genre:

Research Report (0.46)
Contests & Prizes (0.34)

Industry:

Leisure & Entertainment > Sports (1.00)
Leisure & Entertainment > Games > Jeopardy! (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)
(2 more...)

arXiv.org Machine LearningMay-31-2013

Joint Modeling and Registration of Cell Populations in Cohorts of High-Dimensional Flow Cytometric Data

Pyne, Saumyadipta, Wang, Kui, Irish, Jonathan, Tamayo, Pablo, Nazaire, Marc-Danie, Duong, Tarn, Lee, Sharon, Ng, Shu-Kay, Hafler, David, Levy, Ronald, Nolan, Garry, Mesirov, Jill, McLachlan, Geoffrey J.

In systems biomedicine, an experimenter encounters different potential sources of variation in data such as individual samples, multiple experimental conditions, and multi-variable network-level responses. In multiparametric cytometry, which is often used for analyzing patient samples, such issues are critical. While computational methods can identify cell populations in individual samples, without the ability to automatically match them across samples, it is difficult to compare and characterize the populations in typical experiments, such as those responding to various stimulations or distinctive of particular patients or time-points, especially when there are many samples. Joint Clustering and Matching (JCM) is a multi-level framework for simultaneous modeling and registration of populations across a cohort. JCM models every population with a robust multivariate probability distribution. Simultaneously, JCM fits a random-effects model to construct an overall batch template -- used for registering populations across samples, and classifying new samples. By tackling systems-level variation, JCM supports practical biomedical applications involving large cohorts.

artificial intelligence, machine learning, template, (18 more...)

doi: 10.1371/journal.pone.0100334

1305.7344

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Scott, James G., Sun, Liang

Expectation-maximization for logistic regression

arXiv.org Machine LearningMay-31-2013

We present a family of expectation-maximization (EM) algorithms for binary and negative-binomial logistic regression, drawing a sharp connection with the variational-Bayes algorithm of Jaakkola and Jordan (2000). Indeed, our results allow a version of this variational-Bayes approach to be re-interpreted as a true EM algorithm. We study several interesting features of the algorithm, and of this previously unrecognized connection with variational Bayes. We also generalize the approach to sparsity-promoting priors, and to an online method whose convergence properties are easily established. This latter method compares favorably with stochastic-gradient descent in situations with marked collinearity.

algorithm, artificial intelligence, machine learning, (16 more...)

1306.004

Country: Asia > Middle East > Jordan (0.24)

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Feyereisl, Jan, Aickelin, Uwe

Privileged Information for Data Clustering

arXiv.org Machine LearningMay-31-2013

Many machine learning algorithms assume that all input samples are independently and identically distributed from some common distribution on either the input space X, in the case of unsupervised learning, or the input and output space X x Y in the case of supervised and semi-supervised learning. In the last number of years the relaxation of this assumption has been explored and the importance of incorporation of additional information within machine learning algorithms became more apparent. Traditionally such fusion of information was the domain of semi-supervised learning. More recently the inclusion of knowledge from separate hypothetical spaces has been proposed by Vapnik as part of the supervised setting. In this work we are interested in exploring Vapnik's idea of master-class learning and the associated learning using privileged information, however within the unsupervised setting. Adoption of the advanced supervised learning paradigm for the unsupervised setting instigates investigation into the difference between privileged and technical data. By means of our proposed aRi-MAX method stability of the KMeans algorithm is improved and identification of the best clustering solution is achieved on an artificial dataset. Subsequently an information theoretic dot product based algorithm called P-Dot is proposed. This method has the ability to utilize a wide variety of clustering techniques, individually or in combination, while fusing privileged and technical data for improved clustering. Application of the P-Dot method to the task of digit recognition confirms our findings in a real-world scenario.

artificial intelligence, information, machine learning, (14 more...)

1305.7454

Country: North America > United States (0.67)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Journal of Artificial Intelligence ResearchMay-30-2013

A Survey on Latent Tree Models and Applications

Mourad, R., Sinoquet, C., Zhang, N. L., Liu, T., Leray, P.

In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from biological to human sciences. The latent tree model, a particular type of probabilistic graphical models, deserves attention. Its simple structure - a tree - allows simple and efficient inference, while its latent variables capture complex relationships. In the past decade, the latent tree model has been subject to significant theoretical and methodological developments. In this review, we propose a comprehensive study of this model. First we summarize key ideas underlying the model. Second we explain how it can be efficiently learned from data. Third we illustrate its use within three types of applications: latent structure discovery, multidimensional clustering, and probabilistic inference. Finally, we conclude and give promising directions for future researches in this field.

algorithm, complexity, ltm, (17 more...)

doi: 10.1613/jair.3879

AI Access Foundation

10817

Country:

Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
(8 more...)

Genre:

Overview (0.87)
Research Report (0.67)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)

Mokhtar, Sahar A., Elsayad, Alaa. M.

Predicting the Severity of Breast Masses with Data Mining Methods

arXiv.org Machine LearningMay-30-2013

Mammography is the most effective and available tool for breast cancer screening. However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to approximately 70% unnecessary biopsies with benign outcomes. Data mining algorithms could be used to help physicians in their decisions to perform a breast biopsy on a suspicious lesion seen in a mammogram image or to perform a short term follow-up examination instead. In this research paper data mining classification algorithms; Decision Tree (DT), Artificial Neural Network (ANN), and Support Vector Machine (SVM) are analyzed on mammographic masses data set. The purpose of this study is to increase the ability of physicians to determine the severity (benign or malignant) of a mammographic mass lesion from BI-RADS attributes and the patient,s age. The whole data set is divided for training the models and test them by the ratio of 70:30% respectively and the performances of classification algorithms are compared through three statistical measures; sensitivity, specificity, and classification accuracy. Accuracy of DT, ANN and SVM are 78.12%, 80.56% and 81.25% of test samples respectively. Our analysis shows that out of these three classification models SVM predicts severity of breast cancer with least error rate and highest accuracy.

algorithm, artificial intelligence, machine learning, (17 more...)

1305.7057

Country: North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (0.49)
Research Report > New Finding (0.47)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.77)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Perraul-Joncas, Dominique, Meila, Marina

Non-linear dimensionality reduction: Riemannian metric estimation and the problem of geometric discovery

arXiv.org Machine LearningMay-30-2013

In recent years, manifold learning has become increasingly popular as a tool for performing non-linear dimensionality reduction. This has led to the development of numerous algorithms of varying degrees of complexity that aim to recover man ifold geometry using either local or global features of the data. Building on the Laplacian Eigenmap and Diffusionmaps framework, we propose a new paradigm that offers a guarantee, under reasonable assumptions, that any manifo ld learning algorithm will preserve the geometry of a data set. Our approach is based on augmenting the output of embedding algorithms with geometric informatio n embodied in the Riemannian metric of the manifold. We provide an algorithm for estimating the Riemannian metric from data and demonstrate possible application s of our approach in a variety of examples.

artificial intelligence, machine learning, manifold, (16 more...)

1305.7255

Country: North America > United States (1.00)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)

Mankad, Shawn, Michailidis, George

Structural and Functional Discovery in Dynamic Networks with Non-negative Matrix Factorization

arXiv.org Machine LearningMay-30-2013

Due to advances in data collection technologies, it is becoming increasingly common to study time series of networks. An important research question is how to discover the underlying structure and dynamics in time-varying networked systems. In this work, we propose a new matrix factorization-based approach for community discovery and visual exploration within potentially weighted and directed network time-series. Next, we review and discuss this work in relation to popular approaches for addressing the key problems of community detection and visualization of time series of networks. There have been many important contributions for community detection in network time-series, extensively reviewed in [1, 2], from the fields of physics, computer science and statistics. The basic goal of community detection is to extract groups of nodes that feature relatively dense within group connectivity and sparser between group connections [3, 4]. A common strategy is to embed the graphs in low-dimensional latent spaces. For instance, [5] use latent variables to capture groups of papers that evolve similarly in citation network data.

data mining, machine learning, node, (19 more...)

doi: 10.1103/PhysRevE.88.042812

1305.7169

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Networks (0.48)
Telecommunications > Networks (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Chen, Yanqing, Perozzi, Bryan, Al-Rfou, Rami, Skiena, Steven

The Expressive Power of Word Embeddings

arXiv.org Machine LearningMay-29-2013

We seek to better understand the difference in quality of the several publicly released embeddings. We propose several tasks that help to distinguish the characteristics of different embeddings. Our evaluation of sentiment polarity and synonym/antonym relations shows that embeddings are able to capture surprisingly nuanced semantics even in the absence of sentence structure. Moreover, benchmarking the embeddings shows great variance in quality and characteristics of the semantics captured by the tested embeddings. Finally, we show the impact of varying the number of dimensions and the resolution of each dimension on the effective useful features captured by the embedding space. Our contributions highlight the importance of embeddings for NLP tasks and the effect of their quality on the final results.

dimension, machine learning, natural language, (18 more...)

1301.3226

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.50)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Ross, Stephane, Mineiro, Paul, Langford, John

Normalized Online Learning

arXiv.org Machine LearningMay-28-2013

We introduce online learning algorithms which are independent of feature scales, proving regret bounds dependent on the ratio of scales existent in the data rather than the absolute scale. This has several useful effects: there is no need to pre-normalize data, the test-time and test-space complexity are reduced, and the algorithms are more robust.

algorithm, artificial intelligence, machine learning, (17 more...)

1305.6646

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)