AITopics

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the optimization of a geometric loss function, which is a surrogate to the LDA's likelihood. Our method involves a fast optimization-based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on Gibbs sampling and variational inference, while achieving accuracy comparable to that of a Gibbs sampler. The topic estimates produced by our method are shown to be statistically consistent under some conditions. The algorithm is evaluated with extensive experiments on simulated and real data.

artificial intelligence, machine learning, natural language, (17 more...)

1610.09034

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Masood, M. Arjumand, Doshi-Velez, Finale

Rapid Posterior Exploration in Bayesian Non-negative Matrix Factorization

Non-negative Matrix Factorization (NMF) is a popular tool for data exploration. Bayesian NMF promises to also characterize uncertainty in the factorization. Unfortunately, current inference approaches such as MCMC mix slowly and tend to get stuck on single modes. We introduce a novel approach using rapidly-exploring random trees (RRTs) to asymptotically cover regions of high posterior density. These are placed in a principled Bayesian framework via an online extension to nonparametric variational inference. In experiments on real and synthetic data, we obtain greater coverage of the posterior and higher ELBO values than standard NMF inference approaches.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

1610.08928

Genre: Research Report (0.84)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Statistical Inference for Model Parameters in Stochastic Gradient Descent

Chen, Xi, Lee, Jason D., Tong, Xin T., Zhang, Yichen

The stochastic gradient descent (SGD) algorithm has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency. While most existing work focuses on the convergence of the objective function or the error of the obtained solution, we investigate the problem of statistical inference of the true model parameters based on SGD. To this end, we propose two consistent estimators of the asymptotic covariance of the average iterate from SGD: (1) an intuitive plug-in estimator and (2) a computationally more efficient batch-means estimator, which only uses the iterates from SGD. As the SGD process forms a time-inhomogeneous Markov chain, our batch-means estimator with carefully chosen increasing batch sizes generalizes the classical batch-means estimator designed for time-homogeneous Markov chains. The proposed batch-means estimator is of independent interest and can potentially be used for estimating the covariance of other time-inhomogeneous Markov chains. Both proposed estimators allow us to construct asymptotically exact confidence intervals and hypothesis tests. We further discuss an extension to conducting inference based on SGD for high-dimensional linear regression. Using a variant of the SGD algorithm, we construct a debiased estimator of each regression coefficient that is asymptotically normal. This gives a one-pass algorithm for computing both the sparse regression coefficient estimator and confidence intervals, which is computationally attractive and applicable to online data.
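To make the batch-means idea concrete, here is a minimal sketch that runs SGD on a one-dimensional mean-estimation problem and applies the classical equal-size batch-means variance estimate to the iterates. It is not the estimator proposed in the paper, which uses carefully chosen increasing batch sizes; the function names, the step-size schedule, and the toy data are all assumptions made for illustration.

```python
import random


def sgd_iterates(stream, step_scale=0.5, alpha=0.6):
    """Run SGD on the loss 0.5 * (x - z)^2 and return every iterate.

    The step size step_scale * t^(-alpha) is an illustrative choice.
    """
    x = 0.0
    iterates = []
    for t, z in enumerate(stream, start=1):
        eta = step_scale * t ** (-alpha)
        x -= eta * (x - z)  # gradient of 0.5 * (x - z)^2 with respect to x
        iterates.append(x)
    return iterates


def batch_means_variance(iterates, num_batches):
    """Classical batch-means estimate using equal-size batches.

    Splits the iterates into num_batches consecutive batches and scales
    the sample variance of the batch means by the batch size.
    """
    batch_size = len(iterates) // num_batches
    used = iterates[: batch_size * num_batches]
    overall = sum(used) / len(used)
    batch_means = [
        sum(used[k * batch_size:(k + 1) * batch_size]) / batch_size
        for k in range(num_batches)
    ]
    spread = sum((m - overall) ** 2 for m in batch_means)
    return batch_size * spread / (num_batches - 1)


# Toy data: noisy observations around an assumed true mean of 2.0.
rng = random.Random(0)
true_mean = 2.0
data = [rng.gauss(true_mean, 1.0) for _ in range(20000)]

iterates = sgd_iterates(data)
x_bar = sum(iterates) / len(iterates)          # averaged SGD iterate
var_hat = batch_means_variance(iterates, 20)   # variance estimate
half_width = 1.96 * (var_hat / len(iterates)) ** 0.5
print(f"estimate {x_bar:.4f} +/- {half_width:.4f}")
```

With equal batch sizes the estimate can be biased, because late SGD iterates are strongly correlated; that is the issue the increasing batch sizes in the abstract are meant to address.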

artificial intelligence, estimator, machine learning, (16 more...)

1610.08637

Country: North America > United States (0.45)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)

Causal Network Learning from Multiple Interventions of Unknown Manipulated Targets

He, Yango, Geng, Zhi

In this paper, we discuss structure learning of causal networks from multiple data sets obtained by external intervention experiments where we do not know what variables are manipulated. For example, the conditions in these experiments are changed by changing temperature or using drugs, but we do not know what target variables are manipulated by the external interventions. Structure learning from such data sets is more difficult. For this case, we first discuss the identifiability of causal structures. Next we present a graph-merging method for learning causal networks for the case where the sample sizes are large for these interventions. Then for the case where the sample sizes of these interventions are relatively small, we propose a data-pooling method for learning causal networks in which we pool all data sets of these interventions together for the learning. Further we propose a re-sampling approach to evaluate the edges of the causal network learned by the data-pooling method. Finally we illustrate the proposed learning methods by simulations.

artificial intelligence, intervention, machine learning, (18 more...)

1610.08611

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Things Bayes can't do

Ryabko, Daniil

The problem of forecasting conditional probabilities of the next event given the past is considered in a general probabilistic setting. Given an arbitrary (large, uncountable) set C of predictors, we would like to construct a single predictor that performs asymptotically as well as the best predictor in C, on any data. Here we show that there are sets C for which such predictors exist, but none of them is a Bayesian predictor with a prior concentrated on C. In other words, there is a predictor with sublinear regret, but every Bayesian predictor must have a linear regret. This negative finding is in sharp contrast with previous results that establish the opposite for the case when one of the predictors in C achieves asymptotically vanishing error. In such a case, if there is a predictor that achieves asymptotically vanishing error for any measure in C, then there is a Bayesian predictor that also has this property, and whose prior is concentrated on (a countable subset of) C.

artificial intelligence, machine learning, predictor, (18 more...)

doi: 10.1007/978-3-319-46379-7_17

1610.08239

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Molstad, Aaron J., Rothman, Adam J.

A penalized likelihood method for classification with matrix-valued predictors

We propose a penalized likelihood method to fit the linear discriminant analysis model when the predictor is matrix valued. We simultaneously estimate the means and the precision matrix, which we assume has a Kronecker product decomposition. Our penalties encourage pairs of response category mean matrices to have equal entries and also encourage zeros in the precision matrix. To compute our estimators, we use a blockwise coordinate descent algorithm. To update the optimization variables corresponding to response category mean matrices, we use an alternating minimization algorithm that takes advantage of the Kronecker structure of the precision matrix. We show that our method can outperform relevant competitors in classification, even when our modeling assumptions are violated. We analyze an EEG dataset to demonstrate our method's interpretability and classification accuracy.

algorithm, artificial intelligence, machine learning, (17 more...)

1609.07386

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

@machinelearnbot Oct-26-2016, 18:50:50 GMT

A methodology for solving problems with DataScience for Internet of Things - Part One

Real-time systems differ in the way they perform analytics. Specifically, real-time systems perform analytics on short time windows of data streams. Hence, the scope of real-time analytics is a 'window', which typically comprises the last few time slots. Making predictions on real-time data streams involves building an offline model and applying it to a stream. Models incorporate one or more machine learning algorithms, which are trained using the training data.
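The window-plus-offline-model pattern described above can be sketched as follows. This is a minimal illustration, not the article's method: the "offline model" here is just an alert threshold learned from historical values, and the function names, the three-standard-deviation rule, and the sample numbers are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def fit_offline_threshold(training_values, num_std=3.0):
    """Offline step: learn an alert threshold from historical data."""
    return mean(training_values) + num_std * pstdev(training_values)


def window_means(stream, window_size):
    """Online step: yield the mean of the last window_size values."""
    window = deque(maxlen=window_size)  # old values drop out automatically
    for value in stream:
        window.append(value)
        yield mean(window)


# Train offline on historical readings, then apply to a live stream.
training = [10.0, 11.0, 9.0, 10.0, 10.0]
threshold = fit_offline_threshold(training)

stream = [10.0, 10.0, 30.0, 30.0, 30.0, 10.0]
alerts = [m > threshold for m in window_means(stream, window_size=3)]
print(alerts)
```

Using `deque(maxlen=...)` keeps memory bounded by the window size, which is the property that matters when the stream never ends.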

data mining, machine learning, real time system, (19 more...)

@machinelearnbot

Industry: Information Technology > Smart Houses & Appliances (0.44)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.48)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.30)

#artificialintelligence Oct-26-2016, 18:50:36 GMT

The 10 Algorithms Machine Learning Engineers Need to Know

There is no doubt that the sub-field of machine learning / artificial intelligence has gained popularity in the past couple of years. As Big Data is the hottest trend in the tech industry at the moment, machine learning is incredibly powerful for making predictions or calculated suggestions based on large amounts of data. Some of the most common examples of machine learning are Netflix's algorithms that make movie suggestions based on movies you have watched in the past, or Amazon's algorithms that recommend books based on books you have bought before. So if you want to learn more about machine learning, how do you start? For me, my first introduction was an Artificial Intelligence class I took while studying abroad in Copenhagen. My lecturer is a full-time Applied Math and CS professor at the Technical University of Denmark, whose research areas are logic and artificial intelligence, focusing primarily on the use of logic to model human-like planning, reasoning and problem solving.

artificial intelligence, learning, machine learning, (13 more...)

#artificialintelligence

Country:

Europe > Denmark > Capital Region > Copenhagen (0.25)
North America > United States > California > San Francisco County > San Francisco (0.05)

Industry: Information Technology > Services (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

@machinelearnbot Oct-26-2016, 17:21:40 GMT

Text Classification & Sentiment Analysis tutorial / blog

Natural Language Processing (NLP) is a vast area of Computer Science that is concerned with the interaction between Computers and Human Language[1]. Within NLP many tasks are, or can be reformulated as, classification tasks. In classification tasks we are trying to produce a classification function which can give the correlation between a certain 'feature' and a class. This Classifier first has to be trained with a training dataset, and then it can be used to actually classify documents. Training means that we have to determine its model parameters.
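As a small illustration of "training means determining model parameters", here is a sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The excerpt does not say which classifier the tutorial uses, so the choice of Naive Bayes, the function names, and the toy documents are all assumptions.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Estimate model parameters from (text, label) pairs.

    The parameters are simply counts: documents per class and
    word occurrences per class, plus the shared vocabulary.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][token] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training_set = [
    ("great movie loved it", "pos"),
    ("wonderful great acting", "pos"),
    ("terrible movie hated it", "neg"),
    ("awful boring plot", "neg"),
]
model = train_naive_bayes(training_set)
print(classify("great acting", model))
print(classify("boring terrible plot", model))
```

Here each word is a 'feature', and the per-class word counts capture how strongly that feature is associated with each class.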

machine learning, natural language, text classification, (14 more...)

@machinelearnbot

Country:

North America > United States (0.14)
Europe > Netherlands > South Holland > The Hague (0.05)

Genre: Instructional Material > Course Syllabus & Notes (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)

#artificialintelligence Oct-26-2016, 06:10:13 GMT

Static & DYNAMICAL Machine Learning – What is the Difference?

In an earlier blog, "Need for DYNAMICAL Machine Learning: Bayesian exact recursive estimation", I introduced the need for Dynamical ML as we now enter the "Walk" stage of the "Crawl-Walk-Run" evolution of machine learning. First, I defined Static ML as follows: Given a set of inputs and outputs, find a static map between the two during supervised "Training" and use this static map for business purposes during "Operation". I made the following points using IoT as an example. A Dynamical ML solution involves a State-Space data model (more below). What more does a Dynamical ML solution offer?
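One common example of a state-space model with exact recursive Bayesian estimation is the scalar Kalman filter, sketched below. The excerpt does not specify the author's model, so the random-walk state equation, the noise variances, and the function names are assumptions chosen for illustration.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict/update cycle for a scalar random-walk state-space model.

    Assumed model: x_t = x_{t-1} + w_t,  y_t = x_t + v_t,
    with w_t ~ N(0, process_var) and v_t ~ N(0, measurement_var).
    """
    # Predict: the state may have drifted, so uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend the prediction and the measurement by their precision.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


def run_filter(measurements, mean=0.0, var=1.0,
               process_var=0.01, measurement_var=1.0):
    """Apply kalman_step to each measurement as it arrives."""
    history = []
    for y in measurements:
        mean, var = kalman_step(mean, var, y, process_var, measurement_var)
        history.append((mean, var))
    return history


# The estimate keeps adapting during operation, unlike a fixed static map.
history = run_filter([5.0] * 50)
final_mean, final_var = history[-1]
print(round(final_mean, 3), round(final_var, 3))
```

Unlike a static map fitted once during training, the estimate and its variance are revised with every new observation.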

artificial intelligence, machine learning, optimization problem, (11 more...)

#artificialintelligence

Industry: Health & Medicine > Therapeutic Area (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.30)