 Paisley, John


Reweighted Expectation Maximization

arXiv.org Machine Learning

Training deep generative models with maximum likelihood remains a challenge. The typical workaround is to use variational inference (VI) and maximize a lower bound to the log marginal likelihood of the data. Variational auto-encoders (VAEs) adopt this approach. They further amortize the cost of inference by using a recognition network to parameterize the variational family. Amortized VI scales approximate posterior inference in deep generative models to large datasets. However, it introduces an amortization gap and leads to approximate posteriors of reduced expressivity due to the problem known as posterior collapse. In this paper, we consider expectation maximization (EM) as a paradigm for fitting deep generative models. Unlike VI, EM directly maximizes the log marginal likelihood of the data. We rediscover the importance weighted auto-encoder (IWAE) as an instance of EM and propose a new EM-based algorithm for fitting deep generative models called reweighted expectation maximization (REM). REM learns better generative models than the IWAE by decoupling the learning dynamics of the generative model and the recognition network, using a separate expressive proposal found by moment matching. We compared REM to the VAE and the IWAE on several density estimation benchmarks and found that it leads to significantly better performance as measured by log-likelihood.
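
As a concrete reference point for the bound both the IWAE and REM build on, the following is a minimal NumPy sketch of the importance-weighted estimate of log p(x) for a toy one-dimensional Gaussian latent-variable model; the model, the proposal parameters, and the function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.special import logsumexp

# Toy latent-variable model: z ~ N(0, 1), x | z ~ N(z, 1).
# The proposal q(z | x) is a Gaussian recognition distribution.
def log_joint(x, z):
    log_prior = -0.5 * (z ** 2 + np.log(2 * np.pi))
    log_lik = -0.5 * ((x - z) ** 2 + np.log(2 * np.pi))
    return log_prior + log_lik

def log_q(z, mu, sigma):
    return -0.5 * (((z - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2))

def iw_bound(x, mu, sigma, K=50, rng=np.random.default_rng(0)):
    """Importance-weighted lower bound on log p(x) using K proposal samples."""
    z = mu + sigma * rng.standard_normal(K)
    log_w = log_joint(x, z) - log_q(z, mu, sigma)
    return logsumexp(log_w) - np.log(K)

# With K = 1 this reduces to the standard ELBO; larger K tightens the bound.
print(iw_bound(x=1.3, mu=0.6, sigma=0.8, K=1))
print(iw_bound(x=1.3, mu=0.6, sigma=0.8, K=50))
```

The sketch only shows the shared bound; it does not capture REM's separate moment-matched proposal or the decoupled learning dynamics described in the abstract.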


Random Function Priors for Correlation Modeling

arXiv.org Machine Learning

The likelihood model of high dimensional data $X_n$ can often be expressed as $p(X_n|Z_n,\theta)$, where $\theta\mathrel{\mathop:}=(\theta_k)_{k\in[K]}$ is a collection of hidden features shared across objects, indexed by $n$, and $Z_n$ is a non-negative factor loading vector with $K$ entries where $Z_{nk}$ indicates the strength of $\theta_k$ used to express $X_n$. In this paper, we introduce random function priors for $Z_n$ for modeling correlations among its $K$ dimensions $Z_{n1}$ through $Z_{nK}$, which we call population random measure embedding (PRME). Our model can be viewed as a generalized paintbox model [Broderick et al., 2013] using random functions, and can be learned efficiently with neural networks via amortized variational inference. We derive our Bayesian nonparametric method by applying a representation theorem on separately exchangeable discrete random measures.
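
To make the notation above concrete, here is a minimal sketch of the likelihood structure $p(X_n|Z_n,\theta)$ with non-negative loadings, instantiated with a toy Poisson emission; the Poisson choice, the dimensions, and the independent draws of Z are assumptions for illustration only and do not reflect the PRME prior itself.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 100, 5, 20          # objects, shared hidden features, observation dimension

theta = rng.gamma(1.0, 1.0, size=(K, D))   # shared hidden features theta_k
Z = rng.gamma(1.0, 1.0, size=(N, K))       # non-negative factor loadings Z_n
# In PRME the entries Z_{n1}..Z_{nK} would be coupled through a random
# function prior to model their correlations; here they are drawn
# independently purely to keep the sketch short.

# Likelihood p(X_n | Z_n, theta): each object mixes the shared features
# with strengths given by its loading vector.
X = rng.poisson(Z @ theta)                 # toy Poisson emission, shape (N, D)
print(X.shape)
```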


Global Explanations of Neural Networks: Mapping the Landscape of Predictions

arXiv.org Machine Learning

A barrier to the wider adoption of neural networks is their lack of interpretability. While local explanation methods exist for individual predictions, most global attribution methods still reduce neural network decisions to a single set of features. In response, we present an approach for generating global attributions called GAM, which explains the landscape of neural network predictions across subpopulations. GAM augments global explanations with the proportion of samples that each attribution best explains and specifies which samples are described by each attribution. Global explanations also have tunable granularity to detect more or fewer subpopulations. We demonstrate that GAM's global explanations 1) recover the known feature importances of simulated data, 2) match the feature weights of interpretable statistical models on real data, and 3) are judged intuitive by practitioners in user studies. With more transparent predictions, GAM can help ensure that neural network decisions are made for the right reasons.
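
A rough sketch of the general idea of subpopulation-level explanations is below: local attribution vectors are grouped, and each group contributes one representative attribution plus the proportion of samples it explains. The k-means grouping and all names here are stand-ins for illustration, not GAM's actual procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def global_attributions(local_attrs, n_groups=3, seed=0):
    """Group per-sample attribution vectors into subpopulation explanations.

    local_attrs: array of shape (n_samples, n_features), one local
    attribution vector per prediction (e.g., from a gradient-based method).
    Returns a representative attribution per group, the fraction of samples
    each group explains, and the per-sample group assignment.
    """
    km = KMeans(n_clusters=n_groups, random_state=seed, n_init=10)
    labels = km.fit_predict(local_attrs)
    representatives = km.cluster_centers_
    proportions = np.bincount(labels, minlength=n_groups) / len(labels)
    return representatives, proportions, labels

# n_groups plays the role of the tunable granularity: more groups reveal
# more (and smaller) subpopulations with distinct explanations.
```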


Mixed Membership Recurrent Neural Networks

arXiv.org Machine Learning

Recurrent neural networks (RNNs) have become one of the standard models in sequential data analysis [Rumelhart et al., 1986, Elman, 1990]. At each time step of the RNN, an observation is modeled via a neural network using the observations and hidden states from previous time points. Models such as the RNN, and also the hidden Markov model among others, often implicitly assume that a sequence has a fixed time interval between observations. They also often do not account for group-level effects when multiple sequences are observed and each sequence belongs to one of multiple groups. For example, consider data in the form of a sequence of discrete counts for a set of groups, e.g., a sequence of purchases (market baskets) for a set of customers, with one sequence per customer. A vanilla RNN implementation would model these sequences using a network with the same parameters for every customer, which discards customer-level information, and would index time steps by their position in the sequence, which discards the time intervals between orders. However, this information is important: customer-specific effects can improve predictive performance for each customer, while an interval of one day versus one month between orders significantly impacts the items likely to be purchased next.
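
The two pieces of information discussed above can be made concrete with a small PyTorch sketch in which the inter-order time gap and a learned customer embedding are appended to the RNN input; this is only an illustration of the modeling gap, not the mixed membership model developed in the paper, and all names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class CustomerAwareRNN(nn.Module):
    """Toy sketch: an RNN over purchase counts that also sees the time gap
    since the previous order and a learned customer embedding."""

    def __init__(self, n_items, n_customers, embed_dim=16, hidden_dim=64):
        super().__init__()
        self.customer_embed = nn.Embedding(n_customers, embed_dim)
        self.rnn = nn.GRU(n_items + 1 + embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_items)   # rates for the next basket

    def forward(self, baskets, time_gaps, customer_ids):
        # baskets: (batch, T, n_items) counts; time_gaps: (batch, T, 1);
        # customer_ids: (batch,) integer ids providing the group-level effect.
        emb = self.customer_embed(customer_ids)                  # (batch, embed_dim)
        emb = emb.unsqueeze(1).expand(-1, baskets.size(1), -1)   # repeat over time
        x = torch.cat([baskets, time_gaps, emb], dim=-1)
        h, _ = self.rnn(x)
        return torch.exp(self.out(h))   # positive rates, e.g. for a Poisson output
```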


Adaptive and Calibrated Ensemble Learning with Dependent Tail-free Process

arXiv.org Machine Learning

Ensemble learning is a mainstay of modern data science practice. Conventional ensemble algorithms assign to base models a set of deterministic, constant model weights that (1) do not fully account for variations in base model accuracy across subgroups and (2) do not provide uncertainty estimates for the ensemble prediction, which can result in mis-calibrated (i.e., precise but biased) predictions that in turn negatively impact performance in real-world applications. In this work, we present an adaptive, probabilistic approach to ensemble learning that uses a dependent tail-free process as the prior on the ensemble weights. Given an input feature $\mathbf{x} \in \mathcal{X}$, our method optimally combines base models based on their predictive accuracy across the feature space, and provides interpretable uncertainty estimates both in model selection and in ensemble prediction. To encourage scalable and calibrated inference, we derive a structured variational inference algorithm that jointly minimizes the KL objective and the model's calibration score, the continuous ranked probability score (CRPS). We illustrate the utility of our method on a synthetic nonlinear function regression task and on the real-world application of spatio-temporal integration of particle pollution prediction models in New England.
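
The following sketch illustrates the two ingredients described above, feature-dependent combination of base models and the CRPS calibration term for a Gaussian ensemble prediction; the deterministic softmax weights are a stand-in for the dependent tail-free process prior, and all names are assumptions.

```python
import numpy as np
from scipy.stats import norm

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

def ensemble_mean(x, base_preds, weight_logits_fn):
    """Combine base model predictions with feature-dependent weights.
    weight_logits_fn maps the input features x to one logit per base model;
    it is a deterministic stand-in for the tail-free process prior."""
    w = softmax(weight_logits_fn(x))
    return np.dot(w, base_preds)

def gaussian_crps(y, mu, sigma):
    """Closed-form CRPS of a Gaussian predictive distribution N(mu, sigma^2);
    lower is better, rewarding both accuracy and calibration."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm.cdf(z) - 1.0)
                    + 2.0 * norm.pdf(z) - 1.0 / np.sqrt(np.pi))
```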


Towards Explainable Deep Learning for Credit Lending: A Case Study

arXiv.org Artificial Intelligence

Deep learning adoption in the financial services industry has been limited due to a lack of model interpretability. However, several techniques have been proposed to explain predictions made by a neural network. We provide an initial investigation into these techniques for the assessment of credit risk with neural networks.


Fully Supervised Speaker Diarization

arXiv.org Machine Learning

In this paper, we propose a fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural networks (UIS-RNN). Given extracted speaker-discriminative embeddings (a.k.a. d-vectors) from input utterances, each individual speaker is modeled by a parameter-sharing RNN, while the RNN states for different speakers interleave in the time domain. This RNN is naturally integrated with a distance-dependent Chinese restaurant process (ddCRP) to accommodate an unknown number of speakers. Our system is fully supervised and is able to learn from examples where time-stamped speaker labels are annotated. We achieved a 7.6% diarization error rate on NIST SRE 2000 CALLHOME, which is better than the state-of-the-art method using spectral clustering. Moreover, our method decodes in an online fashion while most state-of-the-art systems rely on offline clustering.
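
As a simplified illustration of the nonparametric component, the sketch below computes the prior probability of assigning the next speech segment to each existing speaker or to a new one under a plain Chinese restaurant process; the paper uses a distance-dependent CRP combined with each speaker's RNN likelihood of the observed d-vector, both of which this stand-in omits.

```python
import numpy as np

def speaker_prior_probs(segment_counts, alpha=1.0):
    """Prior probability of assigning the next segment to each existing
    speaker or to a new speaker, under a plain CRP with concentration alpha.
    A simplification: the ddCRP in the paper also depends on distances."""
    counts = np.asarray(segment_counts, dtype=float)
    probs = np.append(counts, alpha)   # last entry = probability mass for a new speaker
    return probs / probs.sum()

# Three speakers seen so far, with 5, 2, and 1 segments respectively:
print(speaker_prior_probs([5, 2, 1], alpha=1.0))
```

In the full system these prior terms would be combined with each speaker-specific RNN's score for the incoming d-vector before decoding online.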


MBA: Mini-Batch AUC Optimization

arXiv.org Machine Learning

Area under the receiver operating characteristic curve (AUC) is an important metric for a wide range of signal processing and machine learning problems, and scalable methods for optimizing AUC have recently been proposed. However, handling very large datasets remains an open challenge for this problem. This paper proposes a novel approach to AUC maximization based on sampling mini-batches of positive/negative instance pairs and computing U-statistics to approximate a global risk minimization problem. The resulting algorithm is simple, fast, and learning-rate free. We show that the number of samples required for good performance is independent of the number of pairs available, which grows quadratically with the number of positive and negative instances. Extensive experiments show the practical utility of the proposed method.
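
The pair-sampling idea can be sketched as follows: draw mini-batches of positive and negative scores and form the U-statistic over all cross pairs, either as the empirical AUC or as a surrogate risk. The squared hinge surrogate and the batch sizes below are illustrative assumptions, and the sketch omits the paper's learning-rate-free update.

```python
import numpy as np

def minibatch_auc(scores_pos, scores_neg):
    """Empirical AUC on the mini-batch: fraction of correctly ordered pairs."""
    diff = scores_pos[:, None] - scores_neg[None, :]   # all positive/negative pairs
    return np.mean(diff > 0)

def minibatch_pairwise_risk(scores_pos, scores_neg):
    """U-statistic estimate of a pairwise (AUC-type) surrogate risk:
    average squared hinge loss over all cross pairs in the mini-batch."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return np.mean(np.maximum(0.0, 1.0 - diff) ** 2)

rng = np.random.default_rng(0)
pos = rng.normal(1.0, 1.0, size=64)   # scores of sampled positive instances
neg = rng.normal(0.0, 1.0, size=64)   # scores of sampled negative instances
print(minibatch_auc(pos, neg), minibatch_pairwise_risk(pos, neg))
```

The point of the mini-batch estimate is that it never touches the full set of pairs, whose number grows quadratically with the dataset.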


Compressed Sensing MRI Using a Recursive Dilated Network

AAAI Conferences

Compressed sensing magnetic resonance imaging (CS-MRI) is an active research topic in the field of inverse problems. Conventional CS-MRI algorithms usually exploit the sparse nature of MRI in an iterative manner. These optimization-based CS-MRI methods are often time-consuming at test time, and are based on fixed transform bases or shallow dictionaries, which limits their modeling capacity. Recently, deep models have been introduced to the CS-MRI problem. One main challenge for CS-MRI methods based on deep learning is the trade-off between model performance and network size. We propose a recursive dilated network (RDN) for CS-MRI that achieves good performance while reducing the number of network parameters. We adopt dilated convolutions in each recursive block to aggregate multi-scale information within the MRI. We also adopt a modified shortcut strategy to help features flow into deeper layers. Experimental results show that the proposed RDN model achieves state-of-the-art performance in CS-MRI while using far fewer parameters than previously required.
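
A rough PyTorch sketch of a recursive block built from dilated convolutions with a shortcut is shown below; the dilation rates, channel width, recursion depth, and the exact shortcut placement are assumptions for illustration rather than the RDN architecture as published.

```python
import torch
import torch.nn as nn

class RecursiveDilatedBlock(nn.Module):
    """Sketch of a recursive block: a small stack of dilated convolutions
    whose weights are reused across recursions, plus a shortcut from the
    block input. Weight sharing is what keeps the parameter count low."""

    def __init__(self, channels=64, recursions=3):
        super().__init__()
        self.recursions = recursions
        self.convs = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)            # multi-scale receptive fields
        ])
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        h = x
        for _ in range(self.recursions):  # shared weights across recursions
            for conv in self.convs:
                h = self.act(conv(h))
            h = h + x                     # shortcut helps features flow deeper
        return h
```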


Variational Inference via $\chi$ Upper Bound Minimization

Neural Information Processing Systems

Variational inference (VI) is widely used as an efficient alternative to Markov chain Monte Carlo. It posits a family of approximating distributions $q$ and finds the closest member to the exact posterior $p$. Closeness is usually measured via a divergence $D(q || p)$ from $q$ to $p$. While successful, this approach also has problems. Notably, it typically leads to underestimation of the posterior variance. In this paper we propose CHIVI, a black-box variational inference algorithm that minimizes $D_{\chi}(p || q)$, the $\chi$-divergence from $p$ to $q$. CHIVI minimizes an upper bound of the model evidence, which we term the $\chi$ upper bound (CUBO). Minimizing the CUBO leads to improved posterior uncertainty, and it can also be used with the classical VI lower bound (ELBO) to provide a sandwich estimate of the model evidence. We study CHIVI on three models: probit regression, Gaussian process classification, and a Cox process model of basketball plays. When compared to expectation propagation and classical VI, CHIVI produces better error rates and more accurate estimates of posterior variance.
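
For reference, a minimal Monte Carlo sketch of the χ upper bound alongside the ELBO for a toy conjugate Gaussian model is given below; the model and the choice n = 2 are illustrative assumptions, and the sketch only estimates the bounds rather than performing CHIVI's black-box optimization.

```python
import numpy as np
from scipy.special import logsumexp

# Toy model: z ~ N(0, 1), x | z ~ N(z, 1); variational family q(z) = N(mu, sigma^2).
def log_joint(x, z):
    return -0.5 * (z ** 2 + (x - z) ** 2 + 2 * np.log(2 * np.pi))

def log_q(z, mu, sigma):
    return -0.5 * (((z - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2))

def cubo(x, mu, sigma, n=2, S=10_000, rng=np.random.default_rng(0)):
    """Monte Carlo estimate of the chi upper bound:
    CUBO_n = (1/n) * log E_q[(p(x, z) / q(z))^n]."""
    z = mu + sigma * rng.standard_normal(S)
    log_w = log_joint(x, z) - log_q(z, mu, sigma)
    return (logsumexp(n * log_w) - np.log(S)) / n

def elbo(x, mu, sigma, S=10_000, rng=np.random.default_rng(1)):
    z = mu + sigma * rng.standard_normal(S)
    return np.mean(log_joint(x, z) - log_q(z, mu, sigma))

# Sandwich estimate of the model evidence: ELBO <= log p(x) <= CUBO.
print(elbo(1.3, 0.65, 0.7), cubo(1.3, 0.65, 0.7))
```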