AITopics

1806.07685

Country:

Asia > China > Fujian Province > Fuzhou (0.04)
North America > Canada > Ontario > Niagara Region > St. Catharines (0.04)
Europe > United Kingdom > Northern Ireland > County Antrim > Newtownabbey (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.51)

Baker, Jack, Fearnhead, Paul, Fox, Emily B, Nemeth, Christopher

Large-Scale Stochastic Sampling from the Probability Simplex

arXiv.org Machine LearningJun-19-2018

Emily B. Fox Department of Statistics University of Washington Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces; when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretization error. To get around this, we propose the stochastic CIR process, which removes all discretization error and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within a SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.

cir process, machine learning, natural language, (17 more...)

1806.07137

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Srivastava, Sanvesh, DePalma, Glen, Liu, Chuanhai

An Asynchronous Distributed Expectation Maximization Algorithm For Massive Data: The DEM Algorithm

arXiv.org Machine LearningJun-19-2018

The family of Expectation-Maximization (EM) algorithms provides a general approach to fitting flexible models for large and complex data. The expectation (E) step of EM-type algorithms is time-consuming in massive data applications because it requires multiple passes through the full data. We address this problem by proposing an asynchronous and distributed generalization of the EM called the Distributed EM (DEM). Using DEM, existing EM-type algorithms are easily extended to massive data settings by exploiting the divide-and-conquer technique and widely available computing power, such as grid computing. The DEM algorithm reserves two groups of computing processes called \emph{workers} and \emph{managers} for performing the E step and the maximization step (M step), respectively. The samples are randomly partitioned into a large number of disjoint subsets and are stored on the worker processes. The E step of DEM algorithm is performed in parallel on all the workers, and every worker communicates its results to the managers at the end of local E step. The managers perform the M step after they have received results from a $\gamma$-fraction of the workers, where $\gamma$ is a fixed constant in $(0, 1]$. The sequence of parameter estimates generated by the DEM algorithm retains the attractive properties of EM: convergence of the sequence of parameter estimates to a local mode and linear global rate of convergence. Across diverse simulations focused on linear mixed-effects models, the DEM algorithm is significantly faster than competing EM-type algorithms while having a similar accuracy. The DEM algorithm maintains its superior empirical performance on a movie ratings database consisting of 10 million ratings.

artificial intelligence, bayesian inference, machine learning, (20 more...)

1806.07533

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Media > Film (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

De Nittis, Giuseppe, Gatti, Nicola

How to Maximize the Spread of Social Influence: A Survey

arXiv.org Artificial IntelligenceJun-19-2018

This survey presents the main results achieved for the influence maximization problem in social networks. This problem is well studied in the literature and, thanks to its recent applications, some of which currently deployed on the field, it is receiving more and more attention in the scientific community. The problem can be formulated as follows: given a graph, with each node having a certain probability of influencing its neighbors, select a subset of vertices so that the number of nodes in the network that are influenced is maximized. Starting from this model, we introduce the main theoretical developments and computational results that have been achieved, taking into account different diffusion models describing how the information spreads throughout the network, various ways in which the sources of information could be placed, and how to tackle the problem in the presence of uncertainties affecting the network. Finally, we present one of the main application that has been developed and deployed exploiting tools and techniques previously discussed.

data mining, machine learning, node, (15 more...)

1806.07757

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(4 more...)

Tikka, Santtu, Karvanen, Juha

Simplifying Probabilistic Expressions in Causal Inference

arXiv.org Artificial IntelligenceJun-19-2018

Obtaining a non-parametric expression for an interventional distribution is one of the most fundamental tasks in causal inference. Such an expression can be obtained for an identifiable causal effect by an algorithm or by manual application of do-calculus. Often we are left with a complicated expression which can lead to biased or inefficient estimates when missing data or measurement errors are involved. We present an automatic simplification algorithm that seeks to eliminate symbolically unnecessary variables from these expressions by taking advantage of the structure of the underlying graphical model. Our method is applicable to all causal effect formulas and is readily available in the R package causaleffect.

artificial intelligence, expression, machine learning, (17 more...)

1806.07082

Country:

North America > United States > New York (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Finland > Central Finland > Jyväskylä (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

Li, Chang, de Rijke, Maarten

Incremental Sparse Bayesian Ordinal Regression

Ordinal Regression (OR) aims to model the ordering information between different data categories, which is a crucial topic in multi-label learning. An important class of approaches to OR models the problem as a linear combination of basis functions that map features to a high dimensional non-linear space. However, most of the basis function-based algorithms are time consuming. We propose an incremental sparse Bayesian approach to OR tasks and introduce an algorithm to sequentially learn the relevant basis functions in the ordinal scenario. Our method, called Incremental Sparse Bayesian Ordinal Regression (ISBOR), automatically optimizes the hyper-parameters via the type-II maximum likelihood method. By exploiting fast marginal likelihood optimization, ISBOR can avoid big matrix inverses, which is the main bottleneck in applying basis function-based algorithms to OR tasks on large-scale datasets. We show that ISBOR can make accurate predictions with parsimonious basis functions while offering automatic estimates of the prediction uncertainty. Extensive experiments on synthetic and real word datasets demonstrate the efficiency and effectiveness of ISBOR compared to other basis function-based OR approaches.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1806.06553

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Lorenzi, Marco, Filippone, Maurizio

Constraining the Dynamics of Deep Probabilistic Models

We introduce a novel generative formulation of deep probabilistic models implementing "soft" constraints on their function dynamics. In particular, we develop a flexible methodological framework where the modeled functions and derivatives of a given order are subject to inequality or equality constraints. We then characterize the posterior distribution over model and constraint parameters through stochastic variational inference. As a result, the proposed approach allows for accurate and scalable uncertainty quantification on the predictions and on all parameters. We demonstrate the application of equality constraints in the challenging problem of parameter inference in ordinary differential equation models, while we showcase the application of inequality constraints on the problem of monotonic regression of count data. The proposed approach is extensively tested in several experimental settings, leading to highly competitive results in challenging modeling applications, while offering high expressiveness, flexibility and scalability.

artificial intelligence, constraint, machine learning, (17 more...)

1802.0568

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Arizona > Maricopa County > Scottsdale (0.14)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Arora, Saurabh, Doshi, Prashant

A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress

Inverse reinforcement learning is the problem of inferring the reward function of an observed agent, given its policy or behavior. Researchers perceive IRL both as a problem and as a class of methods. By categorically surveying the current literature in IRL, this article serves as a reference for researchers and practitioners in machine learning to understand the challenges of IRL and select the approaches best suited for the problem on hand. The survey formally introduces the IRL problem along with its central challenges which include accurate inference, generalizability, correctness of prior knowledge, and growth in solution complexity with problem size. The article elaborates how the current methods mitigate these challenges. We further discuss the extensions of traditional IRL methods: (i) inaccurate and incomplete perception, (ii) incomplete model, (iii) multiple rewards, and (iv) non-linear reward functions. This discussion concludes with some broad advances in the research area and currently open research questions.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1806.06877

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Georgia > Clarke County > Athens (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(8 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation (0.68)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Godard, Pierre, Zanon-Boito, Marcely, Ondel, Lucas, Berard, Alexandre, Yvon, François, Villavicencio, Aline, Besacier, Laurent

Unsupervised Word Segmentation from Speech with Attention

arXiv.org Artificial IntelligenceJun-18-2018

We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL). Our methodology assumes a pairing between recordings in the UL with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones that is segmented using neural soft-alignments produced by a neural machine translation model. Evaluation uses an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.

machine learning, natural language, segmentation, (19 more...)

1806.06734

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
(11 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Havasi, Marton, Hernández-Lobato, José Miguel, Murillo-Fuentes, Juan José

Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo

Deep Gaussian Processes (DGPs) are hierarchical generalizations of Gaussian Processes that combine well calibrated uncertainty estimates with the high flexibility of multilayer models. One of the biggest challenges with these models is that exact inference is intractable. The current state-of-the-art inference method, Variational Inference (VI), employs a Gaussian approximation to the posterior distribution. This can be a potentially poor unimodal approximation of the generally multimodal posterior. In this work, we provide evidence for the non-Gaussian nature of the posterior and we apply the Stochastic Gradient Hamiltonian Monte Carlo method to directly sample from it. To efficiently optimize the hyperparameters, we introduce the Moving Window MCEM algorithm. This results in significantly better predictions at a lower computational cost than its VI counterpart. Thus our method establishes a new state-of-the-art for inference in DGPs.

artificial intelligence, dataset, machine learning, (15 more...)

1806.0549

Genre: Research Report (0.82)

Industry: Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.62)