Goto

Collaborating Authors

 Learning Graphical Models


Artificial Intelligence In Music Production: What Does It Mean For Artists? - DJ TechTools

#artificialintelligence

A lot of DJs and music producers are starting to wonder how these technologies could be implemented in their fields. In this article, DJTT's Steven Maude takes deep dive into current AI music projects, and how they could change the process of music creation in the very near future. From language translation, self-driving cars, to beating humans at traditional games or learning to play classic games from the modern era, artificial intelligence (AI) is a big deal in computer science right now. Thanks to the large data stores that, for better or worse, technology giants are collecting, and powerful graphics cards accelerating the math required, we're in a time of rapid progress in diverse fields. The natural question for DJs and producers: what are the possible implications for AI in music? But big technology names have looked at applying artificial intelligence techniques to music creation.


Bayesian Basics, Explained

#artificialintelligence

Editor's note: The following is an interview with Columbia University Professor Andrew Gelman conducted by Marketing scientist Kevin Gray, in which Gelman spells out the ABCs of Bayesian statistics. Andrew Gelman: Bayesian statistics uses the mathematical rules of probability to combines data with "prior information" to give inferences which (if the model being used is correct) are more precise than would be obtained by either source of information alone. Classical statistical methods avoid prior distributions. In classical statistics, you might include in your model a predictor (for example), or you might exclude it, or you might pool it as part of some larger set of predictors in order to get a more stable estimate. These are pretty much your only choices.


Bayesian Differential Privacy through Posterior Sampling

arXiv.org Machine Learning

Differential privacy formalises privacy-preserving mechanisms that provide access to a database. We pose the question of whether Bayesian inference itself can be used directly to provide private access to data, with no modification. The answer is affirmative: under certain conditions on the prior, sampling from the posterior distribution can be used to achieve a desired level of privacy and utility. To do so, we generalise differential privacy to arbitrary dataset metrics, outcome spaces and distribution families. This allows us to also deal with non-i.i.d or non-tabular datasets. We prove bounds on the sensitivity of the posterior to the data, which gives a measure of robustness. We also show how to use posterior sampling to provide differentially private responses to queries, within a decision-theoretic framework. Finally, we provide bounds on the utility and on the distinguishability of datasets. The latter are complemented by a novel use of Le Cam's method to obtain lower bounds. All our general results hold for arbitrary database metrics, including those for the common definition of differential privacy. For specific choices of the metric, we give a number of examples satisfying our assumptions.


BaTFLED: Bayesian Tensor Factorization Linked to External Data

arXiv.org Machine Learning

The vast majority of current machine learning algorithms are designed to predict single responses or a vector of responses, yet many types of response are more naturally organized as matrices or higher-order tensor objects where characteristics are shared across modes. We present a new machine learning algorithm BaTFLED (Bayesian Tensor Factorization Linked to External Data) that predicts values in a three-dimensional response tensor using input features for each of the dimensions. BaTFLED uses a probabilistic Bayesian framework to learn projection matrices mapping input features for each mode into latent representations that multiply to form the response tensor. By utilizing a Tucker decomposition, the model can capture weights for interactions between latent factors for each mode in a small core tensor. Priors that encourage sparsity in the projection matrices and core tensor allow for feature selection and model regularization. This method is shown to far outperform elastic net and neural net models on 'cold start' tasks from data simulated in a three-mode structure. Additionally, we apply the model to predict dose-response curves in a panel of breast cancer cell lines treated with drug compounds that was used as a Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge.


Known Unknowns: Uncertainty Quality in Bayesian Neural Networks

arXiv.org Machine Learning

We evaluate the uncertainty quality in neural networks using anomaly detection. We extract uncertainty measures (e.g. entropy) from the predictions of candidate models, use those measures as features for an anomaly detector, and gauge how well the detector differentiates known from unknown classes. We assign higher uncertainty quality to candidate models that lead to better detectors. We also propose a novel method for sampling a variational approximation of a Bayesian neural network, called One-Sample Bayesian Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We compare the following candidate neural network models: Maximum Likelihood, Bayesian Dropout, OSBA, and --- for MNIST --- the standard variational approximation. We show that Bayesian Dropout and OSBA provide better uncertainty information than Maximum Likelihood, and are essentially equivalent to the standard variational approximation, but much faster.


A State Space Approach for Piecewise-Linear Recurrent Neural Networks for Reconstructing Nonlinear Dynamics from Neural Measurements

arXiv.org Machine Learning

The computational properties of neural systems are often thought to be implemented in terms of their network dynamics. Hence, recovering the system dynamics from experimentally observed neuronal time series, like multiple single-unit (MSU) recordings or neuroimaging data, is an important step toward understanding its computations. Ideally, one would not only seek a state space representation of the dynamics, but would wish to have access to its governing equations for in-depth analysis. Recurrent neural networks (RNNs) are a computationally powerful and dynamically universal formal framework which has been extensively studied from both the computational and the dynamical systems perspective. Here we develop a semi-analytical maximum-likelihood estimation scheme for piecewise-linear RNNs (PLRNNs) within the statistical framework of state space models, which accounts for noise in both the underlying latent dynamics and the observation process. The Expectation-Maximization algorithm is used to infer the latent state distribution, through a global Laplace approximation, and the PLRNN parameters iteratively. After validating the procedure on toy examples, the approach is applied to MSU recordings from the rodent anterior cingulate cortex obtained during performance of a classical working memory task, delayed alternation. A model with 5 states turned out to be sufficient to capture the essential computational dynamics underlying task performance, including stimulus-selective delay activity. The estimated models were rarely multi-stable, but rather were tuned to exhibit slow dynamics in the vicinity of a bifurcation point. In summary, the present work advances a semi-analytical (thus reasonably fast) maximum-likelihood estimation framework for PLRNNs that may enable to recover the relevant dynamics underlying observed neuronal time series, and directly link them to computational properties.


Bayesian Approach for Sales Time Series Forecasting

#artificialintelligence

In our previous post, we showed the examples of using linear models and machine learning approach for forecasting sales time series. Sometimes we need to forecast not only more probable values of sales but also their distribution. Especially we need it in the risk analysis for assessing different risks related to sales dynamics. In this case we need to take into account sales distributions and dependencies between sales time series features (e.g. One can consider sales as a stochastic variable with some marginal distributions.


Latent Tree Models for Hierarchical Topic Detection

arXiv.org Machine Learning

We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. Unlike LDA-based topic models, HLTMs do not refer to a document generation process and use word variables instead of token variables. They use a tree structure to model the relationships between topics and words, which is conducive to the discovery of meaningful topics and topic hierarchies.


Bayesian Decision Process for Cost-Efficient Dynamic Ranking via Crowdsourcing

arXiv.org Machine Learning

Rank aggregation based on pairwise comparisons over a set of items has a wide range of applications. Although considerable research has been devoted to the development of rank aggregation algorithms, one basic question is how to efficiently collect a large amount of high-quality pairwise comparisons for the ranking purpose. Because of the advent of many crowdsourcing services, a crowd of workers are often hired to conduct pairwise comparisons with a small monetary reward for each pair they compare. Since different workers have different levels of reliability and different pairs have different levels of ambiguity, it is desirable to wisely allocate the limited budget for comparisons among the pairs of items and workers so that the global ranking can be accurately inferred from the comparison results. To this end, we model the active sampling problem in crowdsourced ranking as a Bayesian Markov decision process, which dynamically selects item pairs and workers to improve the ranking accuracy under a budget constraint. We further develop a computationally efficient sampling policy based on knowledge gradient as well as a moment matching technique for posterior approximation. Experimental evaluations on both synthetic and real data show that the proposed policy achieves high ranking accuracy with a lower labeling cost.


Naive Bayes for Dummies; A Simple Explanation

@machinelearnbot

This blog post was originally published as part of an ongoing series, "Popular Algorithms Explained in Simple English" on the AYLIEN Text Analysis Blog. Commonly used in Machine Learning, Naive Bayes is a collection of classification algorithms based on Bayes Theorem. It is not a single algorithm but a family of algorithms that all share a common principle, that every feature being classified is independent of the value of any other feature. So for example, a fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A Naive Bayes classifier considers each of these "features" (red, round, 3" in diameter) to contribute independently to the probability that the fruit is an apple, regardless of any correlations between features.