AITopics | Engelhardt, Barbara E.

Engelhardt, Barbara E.

Unsupervised Domain Adaptation Using Approximate Label Matching

Ash, Jordan T., Schapire, Robert E., Engelhardt, Barbara E.

arXiv.org Artificial Intelligence, Mar-1-2017

Domain adaptation addresses the problem created when training data is generated by a so-called source distribution, but test data is generated by a significantly different target distribution. In this work, we present approximate label matching (ALM), a new unsupervised domain adaptation technique that creates and leverages a rough labeling on the test samples, then uses these noisy labels to learn a transformation that aligns the source and target samples. We show that the transformation estimated by ALM has favorable properties compared to transformations estimated by other methods, which do not use any kind of target labeling. Our model is regularized by requiring that a classifier trained to discriminate source from transformed target samples cannot distinguish between the two. We experiment with ALM on simulated and real data, and show that it outperforms techniques commonly used in the field.
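The regularization criterion described in this abstract (a discriminator should be unable to tell source samples from transformed target samples) can be illustrated with a toy check. The sketch below is not the ALM method. It assumes one-dimensional data, a known shift standing in for the learned transformation, and a hypothetical nearest-centroid discriminator named `discriminator_accuracy`; ALM itself learns the transformation with the help of noisy target labels.

```python
def discriminator_accuracy(source, target):
    """Accuracy of a nearest-centroid domain discriminator on its own training data.

    Hypothetical helper for illustration; ties are assigned to the source domain.
    """
    source_mean = sum(source) / len(source)
    target_mean = sum(target) / len(target)
    correct = 0
    for x in source:
        # Correct when x is at least as close to the source centroid.
        if abs(x - source_mean) <= abs(x - target_mean):
            correct += 1
    for x in target:
        # Correct only when x is strictly closer to the target centroid.
        if abs(x - target_mean) < abs(x - source_mean):
            correct += 1
    return correct / (len(source) + len(target))


source = [-2.0, -1.0, 0.0, 1.0, 2.0]
shift = 10.0  # assumed, known domain shift
target = [x + shift for x in source]

# Candidate transformation: undo the shift so the target overlaps the source.
aligned_target = [x - shift for x in target]

accuracy_before = discriminator_accuracy(source, target)
accuracy_after = discriminator_accuracy(source, aligned_target)
print(accuracy_before, accuracy_after)
```

Discriminator accuracy at chance level (0.5 for two equally sized domains) after the transformation is the signal that the domains have been aligned; accuracy near 1.0 means they are still easy to tell apart.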

deep learning, neural network, transformation, (19 more...)

1602.04889

Country: North America > United States (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Coupled Compound Poisson Factorization

Basbug, Mehmet E., Engelhardt, Barbara E.

arXiv.org Machine Learning, Jan-8-2017

We present a general framework, the coupled compound Poisson factorization (CCPF), to capture the missing-data mechanism in extremely sparse data sets by coupling a hierarchical Poisson factorization with an arbitrary data-generating model. We derive a stochastic variational inference algorithm for the resulting model and, as examples of our framework, implement three different data-generating models (a mixture model, linear regression, and factor analysis) to robustly model non-random missing data in the context of clustering, prediction, and matrix factorization. In all three cases, we test our framework against models that ignore the missing-data mechanism on large-scale studies with non-random missing data, and we show that explicitly modeling the missing-data mechanism substantially improves the quality of the results, as measured using data log likelihood on a held-out test set.

artificial intelligence, bayesian inference, data generating model, (17 more...)

1701.02058

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Dynamic Collaborative Filtering with Compound Poisson Factorization

Jerfel, Ghassen, Basbug, Mehmet E., Engelhardt, Barbara E.

arXiv.org Machine Learning, Nov-1-2016

Model-based collaborative filtering analyzes user-item interactions to infer latent factors that represent user preferences and item characteristics in order to predict future interactions. Most collaborative filtering algorithms assume that these latent factors are static, although it has been shown that user preferences and item perceptions drift over time. In this paper, we propose a conjugate and numerically stable dynamic matrix factorization (DCPF) based on compound Poisson matrix factorization that models the smoothly drifting latent factors using Gamma-Markov chains. We propose a numerically stable Gamma chain construction, and then present a stochastic variational inference approach to estimate the parameters of our model. We apply our model to time-stamped ratings data sets: Netflix, Yelp, and Last.fm, where DCPF achieves a higher predictive accuracy than state-of-the-art static and dynamic factorization models.
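The idea of a smoothly drifting, positive latent factor can be sketched with a generic Gamma-Markov chain. The abstract does not spell out the paper's numerically stable construction, so the code below is only an assumed, simplified chain in which each state is Gamma-distributed with mean equal to the previous state; the function name `sample_gamma_chain` and all parameter values are hypothetical.

```python
import random


def sample_gamma_chain(n_steps, initial_value, shape, rng):
    """Draw a positive chain whose conditional mean equals the previous state."""
    chain = [initial_value]
    for _ in range(n_steps - 1):
        previous = chain[-1]
        # random.gammavariate takes (shape, scale); the mean is shape * scale,
        # so scale = previous / shape keeps E[next | previous] = previous.
        chain.append(rng.gammavariate(shape, previous / shape))
    return chain


rng = random.Random(0)
# A larger shape gives smaller step-to-step variance, i.e. smoother drift.
smooth_factor = sample_gamma_chain(n_steps=20, initial_value=1.0, shape=200.0, rng=rng)
rough_factor = sample_gamma_chain(n_steps=20, initial_value=1.0, shape=2.0, rng=rng)
print(smooth_factor[:5])
print(rough_factor[:5])
```

In this simplified chain the shape parameter controls how quickly a user or item factor is allowed to drift between time steps.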

artificial intelligence, latent factor, machine learning, (17 more...)

1608.04839

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (1.00)
Media > Music (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Hierarchical Compound Poisson Factorization

Basbug, Mehmet E., Engelhardt, Barbara E.

arXiv.org Machine Learning, May-26-2016

Non-negative matrix factorization models based on a hierarchical Gamma-Poisson structure capture user and item behavior effectively in extremely sparse data sets, making them the ideal choice for collaborative filtering applications. Hierarchical Poisson factorization (HPF) in particular has proved successful for scalable recommendation systems with extreme sparsity. HPF, however, suffers from a tight coupling of sparsity model (absence of a rating) and response model (the value of the rating), which limits the expressiveness of the latter. Here, we introduce hierarchical compound Poisson factorization (HCPF) that has the favorable Gamma-Poisson structure and scalability of HPF to high-dimensional extremely sparse matrices. More importantly, HCPF decouples the sparsity model from the response model, allowing us to choose the most suitable distribution for the response. HCPF can capture binary, non-negative discrete, non-negative continuous, and zero-inflated continuous responses. We compare HCPF with HPF on nine discrete and three continuous data sets and conclude that HCPF captures the relationship between sparsity and response better than HPF.
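The decoupling of sparsity and response described here follows the general shape of a compound Poisson model: a Poisson count decides whether an entry is present, and the observed value is the sum of that many draws from a separately chosen element distribution. The following is a minimal generative sketch under assumptions, not the authors' implementation: the Gamma shape and scale, the element distributions, and the names `sample_poisson` and `sample_compound_poisson_matrix` are all illustrative, and no inference is performed.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_compound_poisson_matrix(n_users, n_items, n_factors, element_sampler, rng):
    """Sample a sparse matrix whose entries are sums of a Poisson number of elements."""
    # Non-negative Gamma-distributed factors; shape 0.3 and scale 1.0 are illustrative.
    users = [[rng.gammavariate(0.3, 1.0) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.gammavariate(0.3, 1.0) for _ in range(n_factors)] for _ in range(n_items)]
    matrix = []
    for user in users:
        row = []
        for item in items:
            rate = sum(u * v for u, v in zip(user, item))
            count = sample_poisson(rate, rng)  # sparsity model: 0 means no rating
            # Response model: sum of `count` draws from the element distribution.
            row.append(sum(element_sampler(rng) for _ in range(count)))
        matrix.append(row)
    return matrix


rng = random.Random(0)
# Continuous responses: each element is a normal draw (an assumed choice).
continuous = sample_compound_poisson_matrix(5, 4, 3, lambda r: r.gauss(2.0, 0.5), rng)
# Count responses: each element is the constant 1, which recovers plain Poisson counts.
counts = sample_compound_poisson_matrix(5, 4, 3, lambda r: 1, rng)
for row in continuous:
    print(row)
```

Swapping `element_sampler` changes the response type without touching the Poisson layer that governs which entries are zero, which is the kind of separation the abstract attributes to HCPF.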

artificial intelligence, health & medicine, response model, (18 more...)

1604.03853

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry:

Media (0.71)
Leisure & Entertainment (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)

Differential gene co-expression networks via Bayesian biclustering models

Gao, Chuan, Zhao, Shiwen, McDowell, Ian C., Brown, Christopher D., Engelhardt, Barbara E.

arXiv.org Machine Learning, Nov-7-2014

Identifying latent structure in large data matrices is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are locally co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes whose covariation may be observed in only a subset of the samples. Our biclustering method, BicMix, has desirable properties, including allowing overcomplete representations of the data, computational tractability, and jointly modeling unknown confounders and biological signals. Compared with related biclustering methods, BicMix recovers latent structure with higher precision across diverse simulation scenarios. Further, we develop a method to recover gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and recover a gene co-expression network that is differential across ER+ and ER- samples.

bayesian inference, co-expression network, oncology, (22 more...)

1411.1997

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.49)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (0.94)
(3 more...)
