 Uncertainty


Fast Counting in Machine Learning Applications

arXiv.org Machine Learning

We propose scalable methods to execute counting queries in machine learning applications. To achieve memory and computational efficiency, we abstract counting queries and their context such that the counts can be aggregated as a stream. We demonstrate the performance and scalability of the resulting approach on random queries, and through extensive experimentation using Bayesian network learning and association rule mining. Our methods significantly outperform commonly used ADtrees and hash tables, and are practical alternatives for processing large-scale data.


Fast Gaussian Process Based Gradient Matching for Parameter Identification in Systems of Nonlinear ODEs

arXiv.org Machine Learning

Parameter identification and comparison of dynamical systems is a challenging task in many fields. Bayesian approaches based on Gaussian process regression over time-series data have been successfully applied to infer the parameters of a dynamical system without explicitly solving it. While the benefits in computational cost are well established, a rigorous mathematical framework has been missing. We offer a novel interpretation which leads to a better understanding and improvements in state-of-the-art performance in terms of accuracy for nonlinear dynamical systems.


Solving Bongard Problems with a Visual Language and Pragmatic Reasoning

arXiv.org Artificial Intelligence

More than 50 years ago, Bongard introduced 100 visual concept learning problems as a testbed for intelligent vision systems. These problems are now known as Bongard problems. Although they are well known in the cognitive science and AI communities, only moderate progress has been made towards building systems that can solve a substantial subset of them. In the system presented here, visual features are extracted through image processing and then translated into a symbolic visual vocabulary. We introduce a formal language that allows representing complex visual concepts based on this vocabulary. Using this language and Bayesian inference, complex visual concepts can be induced from the examples that are provided in each Bongard problem. Unlike in other concept learning problems, the examples from which concepts are induced are not random in Bongard problems; instead, they are carefully chosen to communicate the concept, and hence require pragmatic reasoning. Taking pragmatic reasoning into account, we find good agreement between the concepts with high posterior probability and the solutions formulated by Bongard himself. While this approach is far from solving all Bongard problems, it solves the biggest fraction yet.


The Data Science View: Can Simplicity Win Over Complexity?

@machinelearnbot

Paula Parpart's research explores why simpler algorithms can sometimes outperform more complex ones. Since the 1970s, a rare point of agreement between Nobel Laureate Daniel Kahneman and prominent Max Planck director Gerd Gigerenzer has been that decision heuristics are an alternative to Bayesian rationality. In cognitive science and psychology, heuristics are decision-making algorithms that follow a set of simple rules and deliberately ignore information in the input data. For example, when making real-world decisions such as choosing which coffee to buy or which apartment to rent, there are potentially thousands of features that could play into the decision, but we usually do not have the time or memory capacity to use them all. In choosing between two apartments, instead of considering all available information sources such as proximity to work, proximity to schools, crime rates, neighbourhood sport facilities or market trends, a simple heuristic called "Take-The-Best" (Gigerenzer & Goldstein, 1996) would rely only on the most important cue that is able to discriminate between the apartments, and ignore all other cues.
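The Take-The-Best rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the article: it assumes binary cue values (1 = favourable, 0 = unfavourable) that are already sorted from most to least important, and the function name `take_the_best`, the cue order, and the apartment data are all hypothetical.

```python
from typing import Optional, Sequence


def take_the_best(cues_a: Sequence[int], cues_b: Sequence[int]) -> Optional[str]:
    """Choose between options A and B using the first discriminating cue.

    Both sequences hold binary cue values, ordered from the most to the
    least important cue (an assumption of this sketch).
    Returns "A" or "B", or None if no cue discriminates.
    """
    for a, b in zip(cues_a, cues_b):
        if a != b:
            # Stop at the first cue that differs; all later cues are ignored.
            return "A" if a > b else "B"
    return None


# Hypothetical cue order: close to work, low crime rate, close to schools.
apartment_a = [1, 0, 1]
apartment_b = [1, 1, 0]

# The first cue ties, so the decision rests on the second cue alone.
choice = take_the_best(apartment_a, apartment_b)
print(choice)
```

Note that the third cue favours apartment A but never enters the decision, which is exactly the deliberate ignoring of information that the heuristic is known for.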


Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems

arXiv.org Machine Learning

Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). To date, their training has been implemented on general purpose computers (GPCs), which are flexible in programming but energy-consuming. Towards low-energy implementations, this paper investigates their training on an emerging hardware technology called neuromorphic multi-chip systems (NMSs). NMSs are very effective for a family of algorithms called spiking neural networks (SNNs). We present three SNNs to train topic models. The first SNN is a batch algorithm combining the conventional collapsed Gibbs sampling (CGS) algorithm and an inference SNN to train LDA. The other two SNNs are online algorithms targeting both energy- and storage-limited environments. The two online algorithms are equivalent to training LDA by using maximum a posteriori estimation and maximizing the semi-collapsed likelihood, respectively. They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable with the GPC algorithms, while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.


CoT: Cooperative Training for Generative Modeling

arXiv.org Machine Learning

We propose Cooperative Training (CoT) for training generative models that measure a tractable density function for target data. CoT coordinately trains a generator $G$ and an auxiliary predictive mediator $M$. The training target of $M$ is to estimate a mixture density of the learned distribution $G$ and the target distribution $P$, and that of $G$ is to minimize the Jensen-Shannon divergence estimated through $M$. CoT achieves independent success without the necessity of pre-training via Maximum Likelihood Estimation or involving high-variance algorithms like REINFORCE. This low-variance algorithm is theoretically proved to be unbiased for both generative and predictive tasks. We also theoretically and empirically show the superiority of CoT over most previous algorithms, in terms of generative quality and diversity, predictive generalization ability and computational cost.
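For readers unfamiliar with the quantity $G$ is trained to minimize, the following sketch computes the Jensen-Shannon divergence between two discrete distributions. It is not the CoT training procedure: it assumes the standard equal-weight mixture $M = (P + G)/2$ (the abstract only says "a mixture density"), it uses exact probabilities rather than an estimated mediator, and the function names are illustrative.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence KL(p || q) in nats for discrete distributions."""
    # Terms with p_i == 0 contribute nothing, by the usual convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: Sequence[float], g: Sequence[float]) -> float:
    """Jensen-Shannon divergence between target p and generator g."""
    # Equal-weight mixture of the two distributions (assumed here).
    m = [(pi + gi) / 2 for pi, gi in zip(p, g)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(g, m)


target = [0.5, 0.3, 0.2]      # hypothetical target distribution P
generator = [0.2, 0.3, 0.5]   # hypothetical learned distribution G

print(js_divergence(target, generator))
print(js_divergence(target, target))
```

The divergence is symmetric, is zero when the two distributions coincide, and is bounded above by $\log 2$ (in nats), which is reached when the distributions have disjoint support.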


Multimodal Sparse Bayesian Dictionary Learning

arXiv.org Machine Learning

The purpose of this paper is to address the problem of learning dictionaries for multimodal datasets, i.e. datasets collected from multiple data sources. We present an algorithm called multimodal sparse Bayesian dictionary learning (MSBDL). The MSBDL algorithm is able to leverage information from all available data modalities through a joint sparsity constraint on each modality's sparse codes without restricting the coefficients themselves to be equal. Our framework offers a considerable amount of flexibility to practitioners and addresses many of the shortcomings of existing multimodal dictionary learning approaches. Unlike existing approaches, MSBDL allows the dictionaries for each data modality to have different cardinality. In addition, MSBDL can be used in numerous scenarios, from small datasets to extensive datasets with large dimensionality. MSBDL can also be used in supervised settings and allows for learning multimodal dictionaries concurrently with classifiers for each modality.


A review of possible effects of cognitive biases on interpretation of rule-based machine learning models

arXiv.org Machine Learning

This paper investigates to what extent cognitive biases affect human understanding of interpretable machine learning models, in particular of rules discovered from data. Twenty cognitive biases (illusions, effects) are covered, as are possibly effective debiasing techniques that can be adopted by designers of machine learning algorithms and software. While there seems to be no universal approach for eliminating all the identified cognitive biases, it follows from our analysis that the effect of most biases can be ameliorated by making rule-based models more concise. Due to the lack of previous research, our review transfers general results obtained in cognitive psychology to the domain of machine learning. It needs to be followed by empirical studies specifically aimed at the machine learning domain.


Artificial Intelligence #3: kNN & Bayes Classification method

@machinelearnbot

In this course you learn the k-Nearest Neighbors and Naive Bayes classification methods. In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. The k-NN algorithm is among the simplest of all machine learning algorithms. For classification, a useful technique can be to assign weights to the contributions of the neighbors, so that the nearer neighbors contribute more to the result than the more distant ones. The neighbors are taken from a set of objects for which the class (for k-NN classification) is known.
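The neighbor-weighting idea can be illustrated with a small, dependency-free sketch. It assumes Euclidean distance and inverse-distance weights, which is one common choice among several and is not stated in the course description; the function name `weighted_knn_predict` and the toy data are hypothetical.

```python
import math
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple

Point = Sequence[float]


def weighted_knn_predict(
    train: List[Tuple[Point, Hashable]],
    query: Point,
    k: int = 3,
    eps: float = 1e-9,
) -> Hashable:
    """Classify `query` by a distance-weighted vote of its k nearest neighbors."""
    # Sort labeled training points by Euclidean distance and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]

    votes = defaultdict(float)
    for point, label in neighbors:
        # Inverse-distance weight: nearer neighbors count more.
        # eps avoids division by zero when the query equals a training point.
        votes[label] += 1.0 / (math.dist(point, query) + eps)

    return max(votes, key=votes.get)


# Toy data: one "red" point very close to the query, two "blue" points farther away.
train = [((0.0, 0.0), "red"), ((2.0, 0.0), "blue"), ((0.0, 2.0), "blue")]
prediction = weighted_knn_predict(train, (0.1, 0.1), k=3)
print(prediction)
```

In this toy example an unweighted majority vote over the three neighbors would pick "blue" (two votes to one), whereas the weighted vote favors the single, much closer "red" neighbor.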


Building Function Approximators on top of Haar Scattering Networks

arXiv.org Machine Learning

The field of artificial neural networks exploded during the 1980s due to its universal approximation capabilities, as can be seen in [1], but the lack of understanding of the underlying statistical and geometric features extracted from the analyzed signal significantly discouraged its use among scientists and researchers, as can be seen in [2-3]. Since then, most of its use has been relegated to applications where such understanding can be neglected, such as computer vision, nonlinear state-space estimators and other tasks related to control where exact algorithmic approaches are unknown or too difficult to implement, according to [3]. More recently, aiming to shed light on these black boxes, several approaches have been under heavy development, such as variable contributions in the feed-forward structure [4], visualization using saliency maps [5], generation of skeletal structures [6], fuzzy rule based evaluation of all permutations [3], and extraction of functional relations using sensitivity analysis of input data [7], among many others. In parallel, other researchers have been successfully developing new kinds of feed-forward neural architectures that behave much more like a transparent box, where the extracted features can be directly evaluated and understood. Convolutional Neural Networks are a great example of such achievements, as can be seen in [8-10]. Despite their several layers, they can be employed on different types of tasks, including text classification, natural language processing, computer vision and so on, with a good understanding of what is happening behind the curtains. Manuscript received January 15, 2018. This work was supported in part by the FIPE (Institute of Economic Research Foundation) by means of a postdoctoral scholarship.