AITopics | Xing, Eric

Plotting

Xing, Eric

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Nonparametric Variational Auto-encoders for Hierarchical Representation Learning

Goyal, Prasoon, Hu, Zhiting, Liang, Xiaodan, Wang, Chenyu, Xing, Eric

arXiv.org Machine LearningAug-25-2017

The recently developed variational autoencoders (VAEs) have proved to be an effective confluence of the rich representational power of neural networks with Bayesian methods. However, most work on VAEs use a rather simple prior over the latent variables such as standard normal distribution, thereby restricting its applications to relatively simple phenomena. In this work, we propose hierarchical nonparametric variational autoencoders, which combines tree-structured Bayesian nonparametric priors with VAEs, to enable infinite flexibility of the latent representation space. Both the neural parameters and Bayesian priors are learned jointly using tailored variational inference. The resulting model induces a hierarchical structure of latent semantic concepts underlying the data corpus, and infers accurate representations of data instances. We apply our model in video representation learning. Our method is able to discover highly interpretable activity hierarchies, and obtain improved clustering accuracy and generalization capacity based on the learned rich representations.

deep learning, neural network, node, (18 more...)

arXiv.org Machine Learning

1703.07027

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Post-Inference Prior Swapping

Neiswanger, Willie, Xing, Eric

arXiv.org Artificial IntelligenceJul-12-2017

While Bayesian methods are praised for their ability to incorporate useful prior knowledge, in practice, convenient priors that allow for computationally cheap or tractable inference are commonly used. In this paper, we investigate the following question: for a given model, is it possible to compute an inference result with any convenient false prior, and afterwards, given any target prior of interest, quickly transform this result into the target posterior? A potential solution is to use importance sampling (IS). However, we demonstrate that IS will fail for many choices of the target prior, depending on its parametric form and similarity to the false prior. Instead, we propose prior swapping, a method that leverages the pre-inferred false posterior to efficiently generate accurate posterior samples under arbitrary target priors. Prior swapping lets us apply less-costly inference algorithms to certain models, and incorporate new or updated prior information "post-inference". We give theoretical guarantees about our method, and demonstrate it empirically on a number of models and priors.

artificial intelligence, bayesian inference, posterior, (18 more...)

arXiv.org Artificial Intelligence

1606.00787

Country:

Asia (0.14)
Oceania > Australia (0.14)
North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Recurrent Estimation of Distributions

Oliva, Junier B., Dubey, Kumar Avinava, Poczos, Barnabas, Xing, Eric, Schneider, Jeff

arXiv.org Machine LearningMay-30-2017

This paper presents the recurrent estimation of distributions (RED) for modeling real-valued data in a semiparametric fashion. RED models make two novel uses of recurrent neural networks (RNNs) for density estimation of general real-valued data. First, RNNs are used to transform input covariates into a latent space to better capture conditional dependencies in inputs. After, an RNN is used to compute the conditional distributions of the latent covariates. The resulting model is efficient to train, compute, and sample from, whilst producing normalized pdfs. The effectiveness of RED is shown via several real-world data experiments. Our results show that RED models achieve a lower held-out negative log-likelihood than other neural network approaches across multiple dataset sizes and dimensionalities. Further context of the efficacy of RED is provided by considering anomaly detection tasks, where we also observe better performance over alternative models.

deep learning, neural network, transformation, (20 more...)

arXiv.org Machine Learning

1705.1075

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Harnessing Deep Neural Networks with Logic Rules

Hu, Zhiting, Ma, Xuezhe, Liu, Zhengzhong, Hovy, Eduard, Xing, Eric

arXiv.org Machine LearningNov-15-2016

Combining deep neural networks with structured logic rules is desirable to harness flexibility and reduce uninterpretability of the neural models. We propose a general framework capable of enhancing various types of neural networks (e.g., CNNs and RNNs) with declarative first-order logic rules. Specifically, we develop an iterative distillation method that transfers the structured information of logic rules into the weights of neural networks. We deploy the framework on a CNN for sentiment analysis, and an RNN for named entity recognition. With a few highly intuitive rules, we obtain substantial improvements and achieve state-of-the-art or comparable results to previous best-performing systems.

artificial intelligence, deep learning, neural network, (20 more...)

arXiv.org Machine Learning

1603.06318

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (0.68)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Latent Variable Modeling with Diversity-Inducing Mutual Angular Regularization

Xie, Pengtao, Deng, Yuntian, Xing, Eric

arXiv.org Machine LearningDec-22-2015

Latent Variable Models (LVMs) are a large family of machine learning models providing a principled and effective way to extract underlying patterns, structure and knowledge from observed data. Due to the dramatic growth of volume and complexity of data, several new challenges have emerged and cannot be effectively addressed by existing LVMs: (1) How to capture long-tail patterns that carry crucial information when the popularity of patterns is distributed in a power-law fashion? (2) How to reduce model complexity and computational cost without compromising the modeling power of LVMs? (3) How to improve the interpretability and reduce the redundancy of discovered patterns? To addresses the three challenges discussed above, we develop a novel regularization technique for LVMs, which controls the geometry of the latent space during learning to enable the learned latent components of LVMs to be diverse in the sense that they are favored to be mutually different from each other, to accomplish long-tail coverage, low redundancy, and better interpretability. We propose a mutual angular regularizer (MAR) to encourage the components in LVMs to have larger mutual angles. The MAR is non-convex and non-smooth, entailing great challenges for optimization. To cope with this issue, we derive a smooth lower bound of the MAR and optimize the lower bound instead. We show that the monotonicity of the lower bound is closely aligned with the MAR to qualify the lower bound as a desirable surrogate of the MAR. Using neural network (NN) as an instance, we analyze how the MAR affects the generalization performance of NN. On two popular latent variable models --- restricted Boltzmann machine and distance metric learning, we demonstrate that MAR can effectively capture long-tail patterns, reduce model complexity without sacrificing expressivity and improve interpretability.

deep learning, latent variable modeling, neural network, (15 more...)

arXiv.org Machine Learning

1512.07336

Country:

Asia > Middle East > Iraq (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Japan > Honshū (0.14)

Genre: Research Report (0.81)

Industry: Health & Medicine > Health Care Technology > Medical Record (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Add feedback

Scalable Gaussian Processes for Characterizing Multidimensional Change Surfaces

Herlands, William, Wilson, Andrew, Nickisch, Hannes, Flaxman, Seth, Neill, Daniel, van Panhuis, Wilbert, Xing, Eric

arXiv.org Machine LearningNov-13-2015

We present a scalable Gaussian process model for identifying and characterizing smooth multidimensional changepoints, and automatically learning changes in expressive covariance structure. We use Random Kitchen Sink features to flexibly define a change surface in combination with expressive spectral mixture kernels to capture the complex statistical structure. Finally, through the use of novel methods for additive non-separable kernels, we can scale the model to large datasets. We demonstrate the model on numerical and real world data, including a large spatio-temporal disease dataset where we identify previously unknown heterogeneous changes in space and time.

artificial intelligence, health & medicine, kernel, (17 more...)

arXiv.org Machine Learning

1511.04408

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.52)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Modeling & Simulation (0.67)

Add feedback

Embarrassingly Parallel Variational Inference in Nonconjugate Models

Neiswanger, Willie, Wang, Chong, Xing, Eric

arXiv.org Machine LearningOct-14-2015

We develop a parallel variational inference (VI) procedure for use in data-distributed settings, where each machine only has access to a subset of data and runs VI independently, without communicating with other machines. This type of "embarrassingly parallel" procedure has recently been developed for MCMC inference algorithms; however, in many cases it is not possible to directly extend this procedure to VI methods without requiring certain restrictive exponential family conditions on the form of the model. Furthermore, most existing (nonparallel) VI methods are restricted to use on conditionally conjugate models, which limits their applicability. To combat these issues, we make use of the recently proposed nonparametric VI to facilitate an embarrassingly parallel VI procedure that can be applied to a wider scope of models, including to nonconjugate models. We derive our embarrassingly parallel VI algorithm, analyze our method theoretically, and demonstrate our method empirically on a few nonconjugate models.

approximation, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1510.04163

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Add feedback

Mining User Interests from Personal Photos

Xie, Pengtao (Carnegie Mellon University) | Pei, Yulong (Carnegie Mellon University) | Xie, Yuan (Carnegie Mellon University) | Xing, Eric (Carnegie Mellon University)

AAAI ConferencesMar-6-2015

Personal photos are enjoying explosive growth with the popularity of photo-taking devices and social media. The vast amount of online photos largely exhibit users' interests, emotion and opinions. Mining user interests from personal photos can boost a number of utilities, such as advertising, interest based community detection and photo recommendation. In this paper, we study the problem of user interests mining from personal photos. We propose a User Image Latent Space Model to jointly model user interests and image contents. User interests are modeled as latent factors and each user is assumed to have a distribution over them. By inferring the latent factors and users' distributions, we can discover what the users are interested in. We model image contents with a four-level hierarchical structure where the layers correspond to themes, semantic regions, visual words and pixels respectively. Users' latent interests are embedded in the theme layer. Given image contents, users' interests can be discovered by doing posterior inference. We use variational inference to approximate the posteriors of latent variables and learn model parameters. Experiments on 180K Flickr photos demonstrate the effectiveness of our model.

artificial intelligence, social media, user interest, (18 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Industry: Information Technology > Services (0.50)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.48)

Add feedback

Cauchy Principal Component Analysis

Xie, Pengtao, Xing, Eric

arXiv.org Machine LearningDec-19-2014

Principal Component Analysis (PCA) has wide applications in machine learning, text mining and computer vision. Classical PCA based on a Gaussian noise model is fragile to noise of large magnitude. Laplace noise assumption based PCA methods cannot deal with dense noise effectively. In this paper, we propose Cauchy Principal Component Analysis (Cauchy PCA), a very simple yet effective PCA method which is robust to various types of noise. We utilize Cauchy distribution to model noise and derive Cauchy PCA under the maximum likelihood estimation (MLE) framework with low rank constraint. Our method can robustly estimate the low rank matrix regardless of whether noise is large or small, dense or sparse. We analyze the robustness of Cauchy PCA from a robust statistics view and present an efficient singular value projection optimization method. Experimental results on both simulated data and real applications demonstrate the robustness of Cauchy PCA to various noise patterns.

artificial intelligence, machine learning, pca, (18 more...)

arXiv.org Machine Learning

1412.6506

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

Add feedback

Fast Function to Function Regression

Oliva, Junier, Neiswanger, Willie, Poczos, Barnabas, Xing, Eric, Schneider, Jeff

arXiv.org Machine LearningOct-27-2014

We analyze the problem of regression when both input covariates and output responses are functions from a nonparametric function class. Function to function regression (FFR) covers a large range of interesting applications including time-series prediction problems, and also more general tasks like studying a mapping between two separate types of distributions. However, previous nonparametric estimators for FFR type problems scale badly computationally with the number of input/output pairs in a data-set. Given the complexity of a mapping between general functions it may be necessary to consider large data-sets in order to achieve a low estimation risk. To address this issue, we develop a novel scalable nonparametric estimator, the Triple-Basis Estimator (3BE), which is capable of operating over datasets with many instances. To the best of our knowledge, the 3BE is the first nonparametric FFR estimator that can scale to massive datasets. We analyze the 3BE's risk and derive an upperbound rate. Furthermore, we show an improvement of several orders of magnitude in terms of prediction speed and a reduction in error over previous estimators in various real-world data-sets.

artificial intelligence, data mining, estimator, (18 more...)

arXiv.org Machine Learning

1410.7414

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.88)

Add feedback