AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Modeling Temporal Crowd Work Quality with Limited Supervision

Jung, Hyun Joon (University of Texas at Austin) | Lease, Matthew (University of Texas at Austin)

AAAI ConferencesNov-1-2015

While recent work has shown that a worker’s performance can be more accurately modeled by temporal correlation in task performance, a fundamental challenge remains in the need for expert gold labels to evaluate a worker’s performance. To solve this problem, we explore two methods of utilizing limited gold labels, initial training and periodic updating. Furthermore, we present a novel way of learning a prediction model in the absence of gold labels with uncertaintyaware learning and soft-label updating. Our experiment with a real crowdsourcing dataset demonstrates that periodic updating tends to show better performance than initial training when the number of gold labels are very limited (< 25).

data mining, gold label, machine learning, (16 more...)

AAAI Conferences

Third AAAI Conference on Human Computation and Crowdsourcing

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.72)
Information Technology > Communications (0.68)
Information Technology > Data Science > Data Mining (0.68)
(2 more...)

Add feedback

From "In" to "Over": Behavioral Experiments on Whole-Network Computation

Dworkin, Lili (University of Pennsylvania) | Kearns, Michael (University of Pennsylvania)

AAAI ConferencesNov-1-2015

We report on a series of behavioral experiments in human computation on three different tasks over networks: graph coloring, community detection (or graph clustering), and competitive contagion. While these tasks share similar action spaces and interfaces, they capture a diversity of computational challenges: graph coloring is a search problem, clustering is an optimization problem, and competitive contagion is a game-theoretic problem. In contrast with most of the prior literature on human-subject experiments in networks, in which collectives of subjects are embedded "in" the network, and have only local information and interactions, here individual subjects have a global (or "over") view and must solve "whole network" problems alone. Our primary findings are that subject performance is impressive across all three problem types; that subjects find diverse and novel strategies for solving each task; and that collective performance can often be strongly correlated with known algorithms.

artificial intelligence, graph, machine learning, (19 more...)

AAAI Conferences

Third AAAI Conference on Human Computation and Crowdsourcing

Country:

North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry:

Education (0.46)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)

Add feedback

A Complete Recipe for Stochastic Gradient MCMC

Ma, Yi-An, Chen, Tianqi, Fox, Emily B.

arXiv.org Machine LearningOct-31-2015

Many recent Markov chain Monte Carlo (MCMC) samplers leverage continuous dynamics to define a transition kernel that efficiently explores a target distribution. In tandem, a focus has been on devising scalable variants that subsample the data and use stochastic gradients in place of full-data gradients in the dynamic simulations. However, such stochastic gradient MCMC samplers have lagged behind their full-data counterparts in terms of the complexity of dynamics considered since proving convergence in the presence of the stochastic gradient noise is non-trivial. Even with simple dynamics, significant physical intuition is often required to modify the dynamical system to account for the stochastic gradient noise. In this paper, we provide a general recipe for constructing MCMC samplers--including stochastic gradient versions--based on continuous Markov processes specified via two matrices. We constructively prove that the framework is complete. That is, any continuous Markov process that provides samples from the target distribution can be written in our framework. We show how previous continuous-dynamic samplers can be trivially "reinvented" in our framework, avoiding the complicated sampler-specific proofs. We likewise use our recipe to straightforwardly propose a new state-adaptive sampler: stochastic gradient Riemann Hamiltonian Monte Carlo (SGRHMC). Our experiments on simulated data and a streaming Wikipedia analysis demonstrate that the proposed SGRHMC sampler inherits the benefits of Riemann HMC, with the scalability of stochastic gradient methods.

artificial intelligence, machine learning, sampler, (14 more...)

arXiv.org Machine Learning

1506.04696

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.76)

Add feedback

Copula variational inference

Tran, Dustin, Blei, David M., Airoldi, Edoardo M.

arXiv.org Machine LearningOct-31-2015

We develop a general variational inference method that preserves dependency among the latent variables. Our method uses copulas to augment the families of distributions used in mean-field and structured approximations. Copulas model the dependency that is not captured by the original variational distribution, and thus the augmented variational family guarantees better approximations to the posterior. With stochastic optimization, inference on the augmented distribution is scalable. Furthermore, our strategy is generic: it can be applied to any inference procedure that currently uses the mean-field or structured approach. Copula variational inference has many advantages: it reduces bias; it is less sensitive to local optima; it is less sensitive to hyperparameters; and it helps characterize and interpret the dependency among the latent variables.

artificial intelligence, copula, machine learning, (18 more...)

arXiv.org Machine Learning

1506.03159

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)

Add feedback

Blitzkriging: Kronecker-structured Stochastic Gaussian Processes

Nickson, Thomas, Gunter, Tom, Lloyd, Chris, Osborne, Michael A, Roberts, Stephen

arXiv.org Machine LearningOct-31-2015

We present Blitzkriging, a new approach to fast inference for Gaussian processes, applicable to regression, optimisation and classification. State-of-the-art (stochastic) inference for Gaussian processes on very large datasets scales cubically in the number of 'inducing inputs', variables introduced to factorise the model. Blitzkriging shares state-of-the-art scaling with data, but reduces the scaling in the number of inducing points to approximately linear. Further, in contrast to other methods, Blitzkriging: does not force the data to conform to any particular structure (including grid-like); reduces reliance on error-prone optimisation of inducing point locations; and is able to learn rich (covariance) structure from the data. We demonstrate the benefits of our approach on real data in regression, time-series prediction and signal-interpolation experiments.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

arXiv.org Machine Learning

1510.07965

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Modeling & Simulation (0.85)

Add feedback

Clustering With Side Information: From a Probabilistic Model to a Deterministic Algorithm

Khashabi, Daniel, Wieting, John, Liu, Jeffrey Yufei, Liang, Feng

arXiv.org Machine LearningOct-31-2015

In this paper, we propose a model-based clustering method (TVClust) that robustly incorporates noisy side information as soft-constraints and aims to seek a consensus between side information and the observed data. Our method is based on a nonparametric Bayesian hierarchical model that combines the probabilistic model for the data instance and the one for the side-information. An efficient Gibbs sampling algorithm is proposed for posterior inference. Using the small-variance asymptotics of our probabilistic model, we then derive a new deterministic clustering algorithm (RDP-means). It can be viewed as an extension of K-means that allows for the inclusion of side information and has the additional property that the number of clusters does not need to be specified a priori. Empirical studies have been carried out to compare our work with many constrained clustering algorithms from the literature on both a variety of data sets and under a variety of conditions such as using noisy side information and erroneous k values. The results of our experiments show strong results for our probabilistic and deterministic approaches under these conditions when compared to other algorithms in the literature.

artificial intelligence, machine learning, side information, (14 more...)

arXiv.org Machine Learning

1508.06235

Country: North America > United States > Illinois > Champaign County > Urbana (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient

Gulcehre, Caglar, Moczulski, Marcin, Bengio, Yoshua

arXiv.org Machine LearningOct-31-2015

Stochastic gradient algorithms have been the main focus of large-scale learning problems and they led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of the noise in stochastic estimates of the gradients. In this paper, we propose a new adaptive learning rate algorithm, which utilizes curvature information for automatically tuning the learning rates. The information about the element-wise curvature of the loss function is estimated from the local statistics of the stochastic first order gradients. We further propose a new variance reduction technique to speed up the convergence. In our preliminary experiments with deep neural networks, we obtained better performance compared to the popular stochastic gradient algorithms.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1412.7419

Genre: Research Report (0.82)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Sparse Variational Bayesian Approximations for Nonlinear Inverse Problems: applications in nonlinear elastography

Franck, Isabell M., Koutsourelakis, P. S.

arXiv.org Machine LearningOct-30-2015

This paper presents an efficient Bayesian framework for solving nonlinear, high-dimensional model calibration problems. It is based on a Variational Bayesian formulation that aims at approximating the exact posterior by means of solving an optimization problem over an appropriately selected family of distributions. The goal is two-fold. Firstly, to find lower-dimensional representations of the unknown parameter vector that capture as much as possible of the associated posterior density, and secondly to enable the computation of the approximate posterior density with as few forward calls as possible. We discuss how these objectives can be achieved by using a fully Bayesian argumentation and employing the marginal likelihood or evidence as the ultimate model validation metric for any proposed dimensionality reduction. We demonstrate the performance of the proposed methodology for problems in nonlinear elastography where the identification of the mechanical properties of biological materials can inform non-invasive, medical diagnosis. An Importance Sampling scheme is finally employed in order to validate the results and assess the efficacy of the approximations provided.

equation, optimization problem, upstream oil & gas, (23 more...)

arXiv.org Machine Learning

1412.0473

Country:

Asia > Middle East (0.14)
Europe > Germany (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Maximum Likelihood Learning With Arbitrary Treewidth via Fast-Mixing Parameter Sets

Domke, Justin

arXiv.org Machine LearningOct-30-2015

Inference is typically intractable in high-treewidth undirected graphical models, making maximum likelihood learning a challenge. One way to overcome this is to restrict parameters to a tractable set, most typically the set of tree-structured parameters. This paper explores an alternative notion of a tractable set, namely a set of "fast-mixing parameters" where Markov chain Monte Carlo (MCMC) inference can be guaranteed to quickly converge to the stationary distribution. While it is common in practice to approximate the likelihood gradient using samples obtained from MCMC, such procedures lack theoretical guarantees. This paper proves that for any exponential family with bounded sufficient statistics, (not just graphical models) when parameters are constrained to a fast-mixing set, gradient descent with gradients approximated by sampling will approximate the maximum likelihood solution inside the set with high-probability. When unregularized, to find a solution epsilon-accurate in log-likelihood requires a total amount of effort cubic in 1/epsilon, disregarding logarithmic factors. When ridge-regularized, strong convexity allows a solution epsilon-accurate in parameter distance with effort quadratic in 1/epsilon. Both of these provide of a fully-polynomial time randomized approximation scheme.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

1509.08992

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Gaussian Process Random Fields

Moore, David A., Russell, Stuart J.

arXiv.org Machine LearningOct-30-2015

Gaussian processes have been successful in both supervised and unsupervised machine learning tasks, but their computational complexity has constrained practical applications. We introduce a new approximation for large-scale Gaussian processes, the Gaussian Process Random Field (GPRF), in which local GPs are coupled via pairwise potentials. The GPRF likelihood is a simple, tractable, and parallelizeable approximation to the full GP marginal likelihood, enabling latent variable modeling and hyperparameter selection on large datasets. We demonstrate its effectiveness on synthetic spatial data as well as a real-world application to seismic event location.

approximation, artificial intelligence, machine learning, (13 more...)

arXiv.org Machine Learning

1511.00054

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback