AITopics

1411.6285

Country: Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Mahani, Alireza S., Sharabiani, Mansour T. A.

SIMD Parallel MCMC Sampling with Applications for Big-Data Bayesian Analytics

arXiv.org Artificial IntelligenceNov-19-2014

Computational intensity and sequential nature of estimation techniques for Bayesian methods in statistics and machine learning, combined with their increasing applications for big data analytics, necessitate both the identification of potential opportunities to parallelize techniques such as MCMC sampling, and the development of general strategies for mapping such parallel algorithms to modern CPUs in order to elicit the performance up the compute-based and/or memory-based hardware limits. Two opportunities for Single-Instruction Multiple-Data (SIMD) parallelization of MCMC sampling for probabilistic graphical models are presented. In exchangeable models with many observations such as Bayesian Generalized Linear Models, child-node contributions to the conditional posterior of each node can be calculated concurrently. In undirected graphs with discrete nodes, concurrent sampling of conditionally-independent nodes can be transformed into a SIMD form. High-performance libraries with multi-threading and vectorization capabilities can be readily applied to such SIMD opportunities to gain decent speedup, while a series of high-level source-code and runtime modifications provide further performance boost by reducing parallelization overhead and increasing data locality for NUMA architectures. For big-data Bayesian GLM graphs, the end-result is a routine for evaluating the conditional posterior and its gradient vector that is 5 times faster than a naive implementation using (built-in) multi-threaded Intel MKL BLAS, and reaches within the striking distance of the memory-bandwidth-induced hardware limit. The proposed optimization strategies improve the scaling of performance with number of cores and width of vector units (applicable to many-core SIMD processors such as Intel Xeon Phi and GPUs), resulting in cost-effectiveness, energy efficiency, and higher speed on multi-core x86 processors.

data mining, machine learning, node, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.csda.2015.02.010

1310.1537

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Huemmer, Christian, Maas, Roland, Kellermann, Walter

The NLMS algorithm with time-variant optimum stepsize derived from a Bayesian network perspective

arXiv.org Machine LearningNov-18-2014

In this article, we derive a new stepsize adaptation for the normalized least mean square algorithm (NLMS) by describing the task of linear acoustic echo cancellation from a Bayesian network perspective. Similar to the well-known Kalman filter equations, we model the acoustic wave propagation from the loudspeaker to the microphone by a latent state vector and define a linear observation equation (to model the relation between the state vector and the observation) as well as a linear process equation (to model the temporal progress of the state vector). Based on additional assumptions on the statistics of the random variables in observation and process equation, we apply the expectation-maximization (EM) algorithm to derive an NLMS-like filter adaptation. By exploiting the conditional independence rules for Bayesian networks, we reveal that the resulting EM-NLMS algorithm has a stepsize update equivalent to the optimal-stepsize calculation proposed by Yamamoto and Kitayama in 1982, which has been adopted in many textbooks. As main difference, the instantaneous stepsize value is estimated in the M step of the EM algorithm (instead of being approximated by artificially extending the acoustic echo path). The EM-NLMS algorithm is experimentally verified for synthesized scenarios with both, white noise and male speech as input signal.

artificial intelligence, bayesian inference, machine learning, (12 more...)

1411.4834

Country: Asia > China (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningNov-17-2014

Parallel Gaussian Process Regression for Big Data: Low-Rank Representation Meets Markov Approximation

Low, Kian Hsiang, Yu, Jiangbo, Chen, Jie, Jaillet, Patrick

The expressive power of a Gaussian process (GP) model comes at a cost of poor scalability in the data size. To improve its scalability, this paper presents a low-rank-cum-Markov approximation (LMA) of the GP model that is novel in leveraging the dual computational advantages stemming from complementing a low-rank approximate representation of the full-rank GP based on a support set of inputs with a Markov approximation of the resulting residual process; the latter approximation is guaranteed to be closest in the Kullback-Leibler distance criterion subject to some constraint and is considerably more refined than that of existing sparse GP models utilizing low-rank representations due to its more relaxed conditional independence assumption (especially with larger data). As a result, our LMA method can trade off between the size of the support set and the order of the Markov property to (a) incur lower computational cost than such sparse GP models while achieving predictive performance comparable to them and (b) accurately represent features/patterns of any scale. Interestingly, varying the Markov order produces a spectrum of LMAs with PIC approximation and full-rank GP at the two extremes. An advantage of our LMA method is that it is amenable to parallelization on multiple machines/cores, thereby gaining greater scalability. Empirical evaluation on three real-world datasets in clusters of up to 32 computing nodes shows that our centralized and parallel LMA methods are significantly more time-efficient and scalable than state-of-the-art sparse and full-rank GP regression methods while achieving comparable predictive performances.

approximation, data mining, machine learning, (17 more...)

1411.451

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Data Science > Data Mining > Big Data (0.40)

Chaves, Rafael, Kueng, Richard, Brask, Jonatan Bohr, Gross, David

A unifying framework for relaxations of the causal assumptions in Bell's theorem

arXiv.org Machine LearningNov-17-2014

Bell's Theorem shows that quantum mechanical correlations can violate the constraints that the causal structure of certain experiments impose on any classical explanation. It is thus natural to ask to which degree the causal assumptions -- e.g. locality or measurement independence -- have to be relaxed in order to allow for a classical description of such experiments. Here, we develop a conceptual and computational framework for treating this problem. We employ the language of Bayesian networks to systematically construct alternative causal structures and bound the degree of relaxation using quantitative measures that originate from the mathematical theory of causality. The main technical insight is that the resulting problems can often be expressed as computationally tractable linear programs. We demonstrate the versatility of the framework by applying it to a variety of scenarios, ranging from relaxations of the measurement independence, locality and bilocality assumptions, to a novel causal interpretation of CHSH inequality violations.

artificial intelligence, correlation, machine learning, (19 more...)

doi: 10.1103/PhysRevLett.114.140403

1411.4648

Country:

Europe > Germany (0.28)
Europe > Switzerland (0.28)
North America > United States (0.28)

Genre: Research Report (0.63)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

arXiv.org Artificial IntelligenceNov-14-2014

rFerns: An Implementation of the Random Ferns Method for General-Purpose Machine Learning

Kursa, Miron B.

Random ferns is a machine learning algorithm proposed by [11] for matching same elements between two images of the same scene, allowing one to recognise certain objects or trace them on videos. The original motivation behind this method was to create a simple and efficient algorithm by extending the Naïve Bayes classifier; still the authors acknowledged its strong connection to the decision tree ensembles like the Random forest [2] algorithm. Since introduction, Random ferns have been applied in numerous computer vision application, like image recognition [1], action recognition [10] or augmented reality [14]. However, it has not gathered attention outside this field; thus, this work aims to bring this algorithm to a much wider spectrum of applications. In order to do that, I propose a generalised version of the algorithm, implemented as an R [13] package rFerns. The paper is organised as follows. Section 2 briefly recalls the Bayesian derivation of the original version of Random ferns, presents the decision tree ensemble interpretation of the algorithm and lists modifications leading to the rFerns variant.

artificial intelligence, fern, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.18637/jss.v061.i10

1202.1121

Country: Europe > Poland > Masovia Province > Warsaw (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Linderman, Scott W., Stock, Christopher H., Adams, Ryan P.

A framework for studying synaptic plasticity with neural spike train data

Synaptic plasticity is believed to be the fundamental building block of learning and memory in the brain. Its study is of crucial importance to understanding the activity and function of neural circuits. With innovations in neural recording technology providing access to the simultaneous activity of increasingly large populations of neurons, statistical models are promising tools for formulating and testing hypotheses about the dynamics of synaptic connectivity. Advances in optical techniques (Packer et al., 2012; Hochbaum et al., 2014), for example, have made it possible to simultaneously record from and stimulate large populations of synaptically connected neurons. Armed with statistical tools capable of inferring time-varying synaptic connectivity, neuroscientists could test competing models of synaptic plasticity, discover new learning rules at the monosynaptic and network level, investigate the effects of disease on synaptic plasticity, and potentially design stimuli to modify neural networks. Despite the popularity of GLMs for spike data, relatively little work has attempted to model the time-varying nature of neural interactions. Here we model interaction weights as a dynamical system governed by parametric synaptic plasticity rules. To perform inference in this model, we use particle Markov Chain Monte Carlo (pMCMC) (Andrieu et al., 2010), a recently developed inference technique for complex time series. We use this new modeling framework to examine the problem of using recorded data to distinguish between proposed variants of spike-timing-dependent plasticity (STDP) learning rules.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1411.4077

Country: North America > United States (0.29)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Error Rate Bounds and Iterative Weighted Majority Voting for Crowdsourcing

Li, Hongwei, Yu, Bin

Crowdsourcing has become an effective and popular tool for human-powered computation to label large datasets. Since the workers can be unreliable, it is common in crowdsourcing to assign multiple workers to one task, and to aggregate the labels in order to obtain results of high quality. In this paper, we provide finite-sample exponential bounds on the error rate (in probability and in expectation) of general aggregation rules under the Dawid-Skene crowdsourcing model. The bounds are derived for multi-class labeling, and can be used to analyze many aggregation methods, including majority voting, weighted majority voting and the oracle Maximum A Posteriori (MAP) rule. We show that the oracle MAP rule approximately optimizes our upper bound on the mean error rate of weighted majority voting in certain setting. We propose an iterative weighted majority voting (IWMV) method that optimizes the error rate bound and approximates the oracle MAP rule. Its one step version has a provable theoretical guarantee on the error rate. The IWMV method is intuitive and computationally simple. Experimental results on simulated and real data show that IWMV performs at least on par with the state-of-the-art methods, and it has a much lower computational cost (around one hundred times faster) than the state-of-the-art methods.

artificial intelligence, machine learning, social media, (17 more...)

1411.4086

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Jacobs, Abigail Z., Clauset, Aaron

A unified view of generative models for networks: models, methods, opportunities, and challenges

These efforts have produced a diverse ecology of models and methods. Despite this diversity, many of these models share a common underlying structure: pairwise interactions (edges) are generated with probability conditional on latent vertex attributes. Differences between models generally stem from different philosophical choices about how to learn from data or different empirically-motivated goals. The highly interdisciplinary nature of work on these generative models, however, has inhibited the development of a unified view of their similarities and differences. For instance, novel theoretical models and optimization techniques developed in machine learning are largely unknown within the social and biological sciences, which have instead emphasized model interpretability. Here, we describe a unified view of generative models for networks that draws together many of these disparate threads and highlights the fundamental similarities and differences that span these fields. We then describe a number of opportunities and challenges for future work that are revealed by this view.

artificial intelligence, machine learning, natural language, (20 more...)

1411.407

Country:

North America > United States (0.47)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Communications > Networks (0.86)
(3 more...)

Pham, Anh T., Raich, Raviv, Fern, Xiaoli Z.

Dynamic Programming for Instance Annotation in Multi-instance Multi-label Learning

Labeling data for classification requires significant human effort. To reduce labeling cost, instead of labeling every instance, a group of instances (bag) is labeled by a single bag label. Computer algorithms are then used to infer the label for each instance in a bag, a process referred to as instance annotation. This task is challenging due to the ambiguity regarding the instance labels. We propose a discriminative probabilistic model for the instance annotation problem and introduce an expectation maximization framework for inference, based on the maximum likelihood approach. For many probabilistic approaches, brute-force computation of the instance label posterior probability given its bag label is exponential in the number of instances in the bag. Our key contribution is a dynamic programming method for computing the posterior that is linear in the number of instances. We evaluate our methods using both benchmark and real world data sets, in the domain of bird song, image annotation, and activity recognition. In many cases, the proposed framework outperforms, sometimes significantly, the current state-of-the-art MIML learning methods, both in instance label prediction and bag label prediction.

accuracy, artificial intelligence, machine learning, (18 more...)

1411.4068

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)