Directed Networks
Boosted Generative Models
Grover, Aditya, Ermon, Stefano
We propose a novel approach for using unsupervised boosting to create an ensemble of generative models, where models are trained in sequence to correct earlier mistakes. Our meta-algorithmic framework can leverage any existing base learner that permits likelihood evaluation, including recent deep expressive models. Further, our approach allows the ensemble to include discriminative models trained to distinguish real data from model-generated data. We show theoretical conditions under which incorporating a new model in the ensemble will improve the fit and empirically demonstrate the effectiveness of our black-box boosting algorithms on density estimation, classification, and sample generation on benchmark datasets for a wide range of generative models.
Intelligent System To Analyze Feedback Sentiments
This paper describes how companies can design an intelligent system to better understand their customers' sentiments and improve their experience, which will help businesses change their market position. Sentiment analysis is widely used in web and social media monitoring. It allows businesses to gain a comprehensive view of public opinion on the organization and its services. Deducing insights from text and emoticons on social media is a practice now widely adopted by organizations worldwide. Digital media represents an extensive opportunity for businesses in any industry to learn the needs, opinions and intent that users share on social media and the web.
Choosing the best language to build your AI chatbot
No, this is not about whether you want your virtual agent to understand English slang, the subjunctive tense in Spanish or even the dozens of ways to say "I" in Japanese. In fact, the programming language you build your bot with is as important as the human language it understands. But how do you differentiate between them? Facebook, Slack and Telegram all support the most popular languages, while API platforms such as Dialogflow, LUIS and wit.ai offer SDKs for the majority. Of course, the caveat should always be to veer toward the language you are most comfortable with, but for those dipping their toe into the programming pond for the first time, a clear winner starts to emerge.
Model selection for Gaussian processes utilizing sensitivity of posterior predictive distribution
Paananen, Topi, Piironen, Juho, Andersen, Michael Riis, Vehtari, Aki
We propose two novel methods for simplifying Gaussian process (GP) models by examining the predictions of a full model in the vicinity of the training points and thereby ordering the covariates based on their predictive relevance. Our results on synthetic and real world data sets demonstrate improved variable selection compared to automatic relevance determination (ARD) in terms of consistency and predictive performance. We expect our proposed methods to be useful in interpreting and understanding complex Gaussian process models.
Inverse Ising problem in continuous time: A latent variable approach
Donner, Christian, Opper, Manfred
In recent years, the inverse Ising problem, i.e. the reconstruction of couplings and external fields of an Ising model from samples of spin configurations, has attracted considerable interest in the physics community [1]. This is due to the fact that Ising models play an important role for data modeling with applications to neural spike data [2, 3], protein structure determination [4], and gene expression analysis [5]. Much effort has been devoted to the development of algorithms for the static inverse Ising problem. This is a nontrivial task, because statistically efficient, likelihood-based methods become computationally infeasible owing to the intractability of the partition function of the model. Hence one has to resort either to approximate inference methods or to other statistical estimators such as pseudo-likelihood methods [6] or the interaction screening algorithm [7]. The situation is somewhat simpler for the dynamical inverse Ising problem, which has recently attracted attention [8-13]. If one assumes Markovian dynamics, the exact normalisation of the spin transition probabilities allows for an explicit computation of the likelihood if one has a complete set of observed data over time. Nevertheless, the model parameters enter the likelihood in a fairly complex way, and the application of more advanced statistical approaches such as Bayesian inference again becomes a nontrivial task. This is especially true for the continuous time kinetic Ising model where the spins are governed by Glauber dynamics [14].
Model-Based Clustering of Time-Evolving Networks through Temporal Exponential-Family Random Graph Models
Lee, Kevin H., Xue, Lingzhou, Hunter, David R.
Dynamic networks are a general language for describing time-evolving complex systems, and discrete time network models provide an emerging statistical technique for various applications. It is a fundamental research question to detect the community structure in time-evolving networks. However, due to significant computational challenges and difficulties in modeling communities of time-evolving networks, there is little progress in the current literature to effectively find communities in time-evolving networks. In this work, we propose a novel model-based clustering framework for time-evolving networks based on discrete time exponential-family random graph models. To choose the number of communities, we use conditional likelihood to construct an effective model selection criterion. Furthermore, we propose an efficient variational expectation-maximization (EM) algorithm to find approximate maximum likelihood estimates of network parameters and mixing proportions. By using variational methods and minorization-maximization (MM) techniques, our method has appealing scalability for large-scale time-evolving networks. The power of our method is demonstrated in simulation studies and empirical applications to international trade networks and the collaboration networks of a large American research university.
The Recycling Gibbs Sampler for Efficient Learning
Martino, Luca, Elvira, Victor, Camps-Valls, Gustau
Monte Carlo methods are essential tools for Bayesian inference. Gibbs sampling is a well-known Markov chain Monte Carlo (MCMC) algorithm, extensively used in signal processing, machine learning, and statistics to draw samples from complicated high-dimensional posterior distributions. The key to successfully applying the Gibbs sampler is the ability to efficiently draw samples from the full-conditional probability density functions. Since this is not possible in the general case, auxiliary samples must be generated to speed up the convergence of the chain, and the information they carry is eventually disregarded. In this work, we show that these auxiliary samples can be recycled within the Gibbs estimators, improving their efficiency with no extra cost. This novel scheme arises naturally after pointing out the relationship between the standard Gibbs sampler and the chain rule used for sampling purposes. Numerical simulations involving simple and real inference problems confirm the excellent performance of the proposed scheme in terms of accuracy and computational efficiency. In particular, we give empirical evidence of performance in a toy example, inference of Gaussian process hyperparameters, and learning dependence graphs through regression.
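As a point of reference for the abstract above, the following is a minimal sketch of a standard Gibbs sampler, not the recycling scheme the authors propose. It assumes a toy target (a zero-mean, unit-variance bivariate normal with correlation rho) whose full conditionals can be sampled exactly; the function name and parameters are illustrative and do not come from the paper.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Standard Gibbs sampler for a zero-mean, unit-variance bivariate normal.

    Illustrative toy target only: both full conditionals are Gaussian,
    x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = random.Random(seed)
    cond_std = (1.0 - rho ** 2) ** 0.5  # standard deviation of each full conditional
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_std)  # draw x from its full conditional
        y = rng.gauss(rho * x, cond_std)  # draw y from its full conditional
        if step >= burn_in:               # discard the burn-in sweeps
            samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.6, n_samples=20000)
mean_x = sum(x for x, _ in samples) / len(samples)
mean_xy = sum(x * y for x, y in samples) / len(samples)
print(mean_x, mean_xy)  # estimates of E[x] = 0 and E[xy] = rho
```

When a full conditional cannot be sampled directly, an inner sampler typically stands in for the `rng.gauss` calls; the auxiliary draws such an inner sampler produces are the ones the abstract proposes to recycle.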
Data Wonderland: Christmas songs from the viewpoint of a data scientist
Whether "Driving Home for Christmas", "Winter Wonderland", "Let it snow!" or "Last Christmas": every year, Christmas songs take over the charts again. While the average Joe is joyfully putting on the next Christmas song, the data scientist starts his journey of discovery through the snowy music history. The data set comes from 55000 Song Lyrics, which contains over 55,000 songs. Our goal is to perform a comprehensive analysis of the song texts to identify the Christmas songs. To do so, we first add a column to the data frame that labels each song as either Christmas or Not Christmas: every song that contains the words Christmas, Xmas or X-mas is labeled Christmas, and every other song is labeled Not Christmas. This is just the initialization of the labels; later we will apply Naive Bayes to a training set to identify the other Christmas songs.
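The keyword-based label initialization described above could look roughly like the sketch below. It uses plain Python dictionaries instead of a data frame, and the song records, field names and the `initial_label` helper are hypothetical; only the three keywords and the two label values come from the text.

```python
import re

# Keywords from the text: Christmas, Xmas, X-mas (matched case-insensitively as whole words).
CHRISTMAS_PATTERN = re.compile(r"\b(christmas|x-?mas)\b", re.IGNORECASE)


def initial_label(lyrics):
    """Return the initial label for one song text."""
    return "Christmas" if CHRISTMAS_PATTERN.search(lyrics) else "Not Christmas"


# Hypothetical stand-ins for rows of the lyrics data set.
songs = [
    {"title": "Song A", "text": "Snow is falling and Christmas is near"},
    {"title": "Song B", "text": "Merry X-mas to everyone tonight"},
    {"title": "Song C", "text": "Walking home in the summer rain"},
]

for song in songs:
    song["label"] = initial_label(song["text"])  # the added label column

labels = [song["label"] for song in songs]
print(labels)
```

Songs such as "Song C" that never mention a keyword start out as Not Christmas, which is why the text treats these labels only as an initialization for a later classifier.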
Hyperparameters Optimization in Deep Convolutional Neural Network / Bayesian Approach with Gaussian Process Prior
Convolutional neural networks (ConvNets) have been used extensively in many complex machine learning tasks. However, hyperparameter optimization is a crucial step in developing ConvNet architectures, since accuracy and performance depend heavily on the hyperparameters. This multilayered architecture is parameterized by a set of hyperparameters such as the number of convolutional layers, the number of fully connected dense layers and neurons, the dropout probability, and the learning rate. Searching the hyperparameter space for such a complex hierarchical architecture is therefore highly difficult. Many methods have been proposed over the past decade to explore the hyperparameter space and find the optimal set of hyperparameter values. Grid search and random search are reportedly inefficient and extremely expensive because of the large number of hyperparameters in the architecture. Sequential model-based Bayesian optimization is therefore a promising alternative technique for finding the extremum of the unknown cost function. A recent study on Bayesian optimization by Snoek, tuning nine convolutional network hyperparameters, achieved the lowest reported error on the CIFAR-10 benchmark. This article is intended to provide an overview of the mathematical concepts behind Bayesian optimization with a Gaussian process prior.
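In sequential model-based optimization, the surrogate model gives a predictive mean and standard deviation for each candidate setting, and an acquisition function decides which setting to evaluate next. The sketch below shows one common acquisition function, expected improvement for a minimization problem. The teaser does not say which acquisition function the article uses, so this choice, the candidate names and the predicted values are all assumptions made for illustration.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement (minimization) for a Gaussian prediction N(mu, sigma^2).

    `best` is the lowest cost observed so far. With z = (best - mu) / sigma,
    EI = (best - mu) * Phi(z) + sigma * phi(z).
    """
    if sigma <= 0.0:
        return max(best - mu, 0.0)  # no predictive uncertainty left
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (best - mu) * cdf + sigma * pdf


# Hypothetical surrogate predictions (mean error, std) for three learning rates.
candidates = {
    "lr=0.1": (0.30, 0.02),
    "lr=0.01": (0.28, 0.05),
    "lr=0.001": (0.35, 0.10),
}
best_so_far = 0.29

# Evaluate next the candidate with the largest expected improvement.
next_setting = max(
    candidates,
    key=lambda name: expected_improvement(*candidates[name], best_so_far),
)
print(next_setting)
```

Expected improvement rewards both a low predicted error and a high predictive uncertainty, which is how the search balances exploitation against exploration.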
Approximate Profile Maximum Likelihood
Pavlichin, Dmitri S., Jiao, Jiantao, Weissman, Tsachy
We propose an efficient algorithm for approximate computation of the profile maximum likelihood (PML), a variant of maximum likelihood maximizing the probability of observing a sufficient statistic rather than the empirical sample. The PML has appealing theoretical properties, but is difficult to compute exactly. Inspired by observations gleaned from exactly solvable cases, we look for an approximate PML solution, which, intuitively, clumps comparably frequent symbols into one symbol. This amounts to lower-bounding a certain matrix permanent by summing over a subgroup of the symmetric group rather than the whole group during the computation. We extensively experiment with the approximate solution, and find the empirical performance of our approach is competitive and sometimes significantly better than state-of-the-art performance for various estimation problems.
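The abstract does not define the sufficient statistic it refers to. Assuming it is the usual profile of a sample, i.e. the multiset of symbol multiplicities with the symbol labels discarded, a minimal sketch of computing it is shown below. The function name is illustrative, and the code does not attempt the approximate PML computation itself.

```python
from collections import Counter


def profile(sample):
    """Map each multiplicity to the number of distinct symbols that occur that often."""
    symbol_counts = Counter(sample)                    # symbol -> multiplicity
    multiplicity_counts = Counter(symbol_counts.values())
    return dict(sorted(multiplicity_counts.items()))   # sorted for readable output


# 'a' occurs 5 times, 'b' and 'r' twice each, 'c' and 'd' once each.
p = profile("abracadabra")
print(p)

# Relabeling the symbols leaves the profile unchanged.
same = profile("xyzxx") == profile("aabca")
print(same)
```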