Goto

Collaborating Authors

 Directed Networks


A Geek's Guide to Machine Learning and Risk analytics and Decisioning Provenir

#artificialintelligence

The greatest challenge when talking about artificial intelligence/machine learning is actually in understanding what data sets we are looking at, and what model/combination of models to apply. Amazon's Machine Learning offering is one example of an automated process which analyses the data and automatically selects the best model to use in the scenario. Other big players who have similar offerings are IBM Watson, Google and Microsoft. Provenir's clients are continually looking at new and innovative ways to improve their risk decisioning. Traditional banks offering consumer, SME and commercial loans and credit, auto lenders, payment providers and fintech companies are using Provenir technology to help them make faster and better decisions about potential fraud. Integrating artificial intelligence/machine learning capabilities into the risk decisioning process can increase the organization's ability to accurately assess the level of risk in order to detect and prevent fraud. Provenir provides model integration adaptors for machine learning models, including Amazon Machine Learning (AML) that can automatically listen for and label business-defined events, calculate attributes and update machine learning models. By combining Provenir technology with machine learning, organizations can increase both the efficiency and predictive accuracy of their risk decisioning.


Tensor-on-tensor regression

arXiv.org Machine Learning

We propose a framework for the linear prediction of a multi-way array (i.e., a tensor) from another multi-way array of arbitrary dimension, using the contracted tensor product. This framework generalizes several existing approaches, including methods to predict a scalar outcome from a tensor, a matrix from a matrix, or a tensor from a scalar. We describe an approach that exploits the multiway structure of both the predictors and the outcomes by restricting the coefficients to have reduced CP-rank. We propose a general and efficient algorithm for penalized least-squares estimation, which allows for a ridge (L_2) penalty on the coefficients. The objective is shown to give the mode of a Bayesian posterior, which motivates a Gibbs sampling algorithm for inference. We illustrate the approach with an application to facial image data. An R package is available at https://github.com/lockEF/MultiwayRegression .


Preserving Differential Privacy in Convolutional Deep Belief Networks

arXiv.org Machine Learning

The remarkable development of deep learning in medicine and healthcare domain presents obvious privacy issues, when deep neural networks are built on users' personal and highly sensitive data, e.g., clinical records, user profiles, biomedical images, etc. However, only a few scientific studies on preserving privacy in deep learning have been conducted. In this paper, we focus on developing a private convolutional deep belief network (pCDBN), which essentially is a convolutional deep belief network (CDBN) under differential privacy. Our main idea of enforcing epsilon-differential privacy is to leverage the functional mechanism to perturb the energy-based objective functions of traditional CDBNs, rather than their results. One key contribution of this work is that we propose the use of Chebyshev expansion to derive the approximate polynomial representation of objective functions. Our theoretical analysis shows that we can further derive the sensitivity and error bounds of the approximate polynomial representation. As a result, preserving differential privacy in CDBNs is feasible. We applied our model in a health social network, i.e., YesiWell data, and in a handwriting digit dataset, i.e., MNIST data, for human behavior prediction, human behavior classification, and handwriting digit recognition tasks. Theoretical analysis and rigorous experimental evaluations show that the pCDBN is highly effective. It significantly outperforms existing solutions.


Deep Learning: Definition, Resources, Comparison with Machine Learning

@machinelearnbot

Deep learning is sometimes referred to as the intersection between machine learning and artificial intelligence. It is about designing algorithms that can make robots intelligent, such a face recognition techniques used in drones to detect and target terrorists, or pattern recognition / computer vision algorithms to automatically pilot a plane, a train, a boat or a car. Many deep learning algorithms (clustering, pattern recognition, automated bidding, recommendation engine, and so on) -- even though they appear in new contexts such as IoT or machine to machine communication -- still rely on relatively old-fashioned techniques such as logistic regression, SVM, decision trees, K-NN, naive Bayes, Bayesian modeling, ensembles, random forests, signal processing, filtering, graph theory, gaming theory, and many others. Some are new, such as indexation algorithms to automate digital publishing, improve search engines, or create and manage large catalogs such as Amazon's product listing. As a result, many deep learning practitioners call themselves data scientist, computer scientist, statistician, or sometimes engineer.


Numbers war: How Bayesian vs frequentist statistics influence AI

#artificialintelligence

If you want to develop your ML and AI skills, you will need to pick up some statistics and before you have got more than a few steps down that path you will find (whether you like it or not) that you have entered the Twilight Zone that is the frequentist/Bayesian religious war. I use the term "war" advisedly because war, by definition, has moved beyond debate and discussion. "Religious" because the war is based on belief systems, not information. The frequentist world has been briefly described here. The Bayesian world is described in what follows.


Polya Urn Latent Dirichlet Allocation: a doubly sparse massively parallel sampler

arXiv.org Machine Learning

Latent Dirichlet Allocation (LDA) is a topic model widely used in natural language processing and machine learning. Most approaches to training the model rely on iterative algorithms, which makes it difficult to run LDA on big data sets that are best analyzed in parallel and distributed computational environments. Indeed, current approaches to parallel inference either don't converge to the correct posterior or require storage of large dense matrices in memory. We present a novel sampler that overcomes both problems, and we show that this sampler is faster, both empirically and theoretically, than previous Gibbs samplers for LDA. We do so by employing a novel Pรณlya-Urn-based approximation in the sparse partially collapsed sampler for LDA. We prove that the approximation error vanishes with data size, making our algorithm asymptotically exact, a property of importance for large-scale topic models. In addition, we show, via an explicit example, that - contrary to popular belief in the topic modeling literature - partially collapsed samplers can be more efficient than fully collapsed samplers. We conclude by comparing the performance of our algorithm with that of other approaches on well-known corpora. Keywords: Bayesian inference, Big Data, computational complexity, Gibbs sampling, Latent Dirichlet Allocation, Markov Chain Monte Carlo, natural language processing, parallel and distributed systems, topic models.


An introduction to Support Vector Machines (SVM) MonkeyLearn Blog

@machinelearnbot

You're refining your training set, and maybe you've even tried stuff out using Naive Bayes. But now you're feeling confident in your dataset, and want to take it one step further. Enter Support Vector Machines (SVM): a fast and dependable classification algorithm that performs very well with a limited amount of data. Perhaps you have dug a bit deeper, and ran into terms like linearly separable, kernel trick and kernel functions. The idea behind the SVM algorithm is simple, and applying it to natural language classification doesn't require most of the complicated stuff.


The 10 Algorithms Machine Learning Engineers Need to Know

#artificialintelligence

It is no doubt that the sub-field of machine learning / artificial intelligence has increasingly gained more popularity in the past couple of years. As Big Data is the hottest trend in the tech industry at the moment, machine learning is incredibly powerful to make predictions or calculated suggestions based on large amounts of data. Some of the most common examples of machine learning are Netflix's algorithms to make movie suggestions based on movies you have watched in the past or Amazon's algorithms that recommend books based on books you have bought before. So if you want to learn more about machine learning, how do you start? For me, my first introduction is when I took an Artificial Intelligence class when I was studying abroad in Copenhagen.


Learning Discrete Bayesian Networks from Continuous Data

Journal of Artificial Intelligence Research

Learning Bayesian networks from raw data can help provide insights into the relationships between variables. While real data often contains a mixture of discrete and continuous-valued variables, many Bayesian network structure learning algorithms assume all random variables are discrete. Thus, continuous variables are often discretized when learning a Bayesian network. However, the choice of discretization policy has significant impact on the accuracy, speed, and interpretability of the resulting models. This paper introduces a principled Bayesian discretization method for continuous variables in Bayesian networks with quadratic complexity instead of the cubic complexity of other standard techniques. Empirical demonstrations show that the proposed method is superior to the established minimum description length algorithm. In addition, this paper shows how to incorporate existing methods into the structure learning process to discretize all continuous variables and simultaneously learn Bayesian network structures.


Continuum Limit of Posteriors in Graph Bayesian Inverse Problems

arXiv.org Machine Learning

We consider the problem of recovering a function input of a differential equation formulated on an unknown domain $M$. We assume to have access to a discrete domain $M_n=\{x_1, \dots, x_n\} \subset M$, and to noisy measurements of the output solution at $p\le n$ of those points. We introduce a graph-based Bayesian inverse problem, and show that the graph-posterior measures over functions in $M_n$ converge, in the large $n$ limit, to a posterior over functions in $M$ that solves a Bayesian inverse problem with known domain. The proofs rely on the variational formulation of the Bayesian update, and on a new topology for the study of convergence of measures over functions on point clouds to a measure over functions on the continuum. Our framework, techniques, and results may serve to lay the foundations of robust uncertainty quantification of graph-based tasks in machine learning. The ideas are presented in the concrete setting of recovering the initial condition of the heat equation on an unknown manifold.