AITopics

1610.08127

Country:

North America > United States (0.29)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Lum, Kristian, Johndrow, James

A statistical framework for fair predictive algorithms

arXiv.org Machine Learning Oct-25-2016

Predictive modeling is increasingly being employed to assist human decision-makers. One purported advantage of replacing human judgment with computer models in high-stakes settings (such as sentencing, hiring, policing, college admissions, and parole decisions) is the perceived "neutrality" of computers. It is argued that because computer models do not hold personal prejudice, the predictions they produce will be equally free from prejudice. There is growing recognition, however, that employing algorithms does not remove the potential for bias and can even amplify it, since training data were inevitably generated by a process that is itself biased. In this paper, we provide a probabilistic definition of algorithmic bias. We propose a method to remove bias from predictive models by removing all information regarding protected variables from the permitted training data. Unlike previous work in this area, our framework is general enough to accommodate arbitrary data types (e.g., binary or continuous). Motivated by models currently in use in the criminal justice system that inform decisions on pre-trial release and parole, we apply our proposed method to a dataset on the criminal histories of individuals at the time of sentencing to produce "race-neutral" predictions of re-arrest. In the process, we demonstrate that the most common approach to creating "race-neutral" models, omitting race as a covariate, still results in racially disparate predictions. We then demonstrate that the application of our proposed method to these data removes racial disparities from predictions with minimal impact on predictive accuracy.
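The claim that omitting a protected variable does not remove its influence can be illustrated with a toy simulation. The sketch below is not the authors' method: it uses synthetic data, a hypothetical covariate `x` that is shifted by a binary protected variable `z`, and simple group-mean centering as a stand-in adjustment. Centering removes only the difference in group means, not every form of dependence between `x` and `z`.

```python
import random
from statistics import mean

random.seed(0)

# Synthetic data (assumption): z is a binary protected variable,
# x is a covariate whose mean is shifted by z.
n = 2000
z = [random.randint(0, 1) for _ in range(n)]
x = [1.5 * zi + random.gauss(0.0, 1.0) for zi in z]


def group_gap(values, groups):
    """Mean of `values` in group 1 minus mean of `values` in group 0."""
    g1 = [v for v, g in zip(values, groups) if g == 1]
    g0 = [v for v, g in zip(values, groups) if g == 0]
    return mean(g1) - mean(g0)


# A model that drops z but keeps x still sees a group difference,
# because x carries information about z.
raw_gap = group_gap(x, z)

# Stand-in adjustment: subtract each group's mean from x, so the
# adjusted covariate has the same mean in both groups.
group_mean = {g: mean(v for v, gi in zip(x, z) if gi == g) for g in (0, 1)}
x_adj = [xi - group_mean[zi] for xi, zi in zip(x, z)]
adj_gap = group_gap(x_adj, z)

print(f"group gap before adjustment: {raw_gap:.3f}")
print(f"group gap after adjustment:  {adj_gap:.3f}")
```

Any predictor built only from `x_adj` can no longer pick up the mean shift between groups, which is the intuition behind adjusting the training data rather than merely dropping the protected column.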

artificial intelligence, machine learning, prediction, (18 more...)

1610.08077

Country: North America > United States (0.69)

Genre: Research Report (0.50)

Industry:

Law > Criminal Law (0.89)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.69)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

@machinelearnbot Oct-24-2016, 03:15:23 GMT

[Discussion] How Gaussian naïve Bayes forms a non-linear decision boundary? • /r/MachineLearning

[Discussion] How does Gaussian naïve Bayes form a non-linear decision boundary? Also, please explain the decision boundary for decision trees. If the two Gaussians are non-isotropic, you can derive that the boundary is a quadratic/elliptic curve.
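The quadratic form of the boundary can be checked numerically. The sketch below covers a single feature and two classes with hypothetical parameters: the log-odds of a Gaussian naïve Bayes model expand to a*x^2 + b*x + c, and the x^2 term vanishes only when both classes share the same variance for that feature. With several features the log-odds are a sum of such per-feature quadratics, so the boundary is a quadric without cross terms.

```python
import math


def log_gauss(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return (-math.log(sigma) - 0.5 * math.log(2 * math.pi)
            - (x - mu) ** 2 / (2 * sigma ** 2))


def log_odds(x, mu0, s0, mu1, s1, p1=0.5):
    """log P(class 1 | x) - log P(class 0 | x) for one Gaussian feature."""
    return ((math.log(p1) + log_gauss(x, mu1, s1))
            - (math.log(1 - p1) + log_gauss(x, mu0, s0)))


def quadratic_coeffs(mu0, s0, mu1, s1, p1=0.5):
    """Coefficients (a, b, c) with log_odds(x) = a*x**2 + b*x + c."""
    a = 1 / (2 * s0 ** 2) - 1 / (2 * s1 ** 2)
    b = mu1 / s1 ** 2 - mu0 / s0 ** 2
    c = (mu0 ** 2 / (2 * s0 ** 2) - mu1 ** 2 / (2 * s1 ** 2)
         + math.log(s0 / s1) + math.log(p1 / (1 - p1)))
    return a, b, c


# Hypothetical parameters: class 0 is narrow, class 1 is wide.
params = dict(mu0=0.0, s0=1.0, mu1=1.0, s1=2.0)
a, b, c = quadratic_coeffs(**params)

# The decision boundary is where the log-odds are zero: two points in 1-D.
disc = b * b - 4 * a * c
boundary = sorted((-b + sgn * math.sqrt(disc)) / (2 * a) for sgn in (-1, 1))

print("quadratic coefficients:", a, b, c)
print("decision boundary points:", boundary)
```

Between the two boundary points the narrow class wins and outside them the wide class wins, which a linear classifier cannot express. Setting `s0 == s1` makes `a` zero and the boundary collapses to a single threshold.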

artificial intelligence, machine learning, non-linear decision boundary, (3 more...)

@machinelearnbot

Industry: Media > News (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.80)

Papamakarios, George, Murray, Iain

Fast $\epsilon$-free Inference of Simulation Models with Bayesian Conditional Density Estimation

arXiv.org Machine Learning Oct-24-2016

Many statistical models can be simulated forwards but have intractable likelihoods. Approximate Bayesian Computation (ABC) methods are used to infer properties of these models from data. Traditionally these methods approximate the posterior over parameters by conditioning on data being inside an $\epsilon$-ball around the observed data, which is only correct in the limit $\epsilon\!\rightarrow\!0$. Monte Carlo methods can then draw samples from the approximate posterior to approximate predictions or error bars on parameters. These algorithms critically slow down as $\epsilon\!\rightarrow\!0$, and in practice draw samples from a broader distribution than the posterior. We propose a new approach to likelihood-free inference based on Bayesian conditional density estimation. Preliminary inferences based on limited simulation data are used to guide later simulations. In some cases, learning an accurate parametric representation of the entire true posterior distribution requires fewer model simulations than Monte Carlo ABC methods need to produce a single sample from an approximate posterior.
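For context, the traditional $\epsilon$-ball scheme that the abstract contrasts itself with can be sketched as plain rejection ABC. This is the baseline, not the authors' density-estimation method. The simulator, the Uniform(-3, 3) prior, and the observed value are all hypothetical choices made for illustration.

```python
import random
from statistics import mean

random.seed(1)


def simulate(theta, n=20):
    """Forward simulator: sample mean of n draws from N(theta, 1)."""
    return mean(random.gauss(theta, 1.0) for _ in range(n))


observed = 0.7  # hypothetical observed summary statistic

# Draw parameters from the prior and run the simulator once per draw.
draws = []
for _ in range(5000):
    theta = random.uniform(-3.0, 3.0)
    draws.append((theta, simulate(theta)))


def abc_rejection(draws, observed, eps):
    """Keep parameters whose simulated summary is within eps of the data."""
    return [theta for theta, s in draws if abs(s - observed) <= eps]


# Shrinking eps tightens the approximation but discards more simulations.
for eps in (1.0, 0.3, 0.1):
    accepted = abc_rejection(draws, observed, eps)
    print(f"eps={eps}: accepted {len(accepted)} of {len(draws)} simulations")
```

The falling acceptance count as `eps` shrinks is the slowdown the abstract describes: each accepted sample costs more simulator runs the closer the approximation gets to the true posterior.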

artificial intelligence, machine learning, posterior, (15 more...)

1605.06376

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Yu, Edward, Parekh, Parth

A Bayesian Ensemble for Unsupervised Anomaly Detection

arXiv.org Machine Learning Oct-24-2016

Methods for unsupervised anomaly detection suffer from the fact that the data is unlabeled, making it difficult to assess the optimality of detection algorithms. Ensemble learning has shown exceptional results in classification and clustering problems, but has not seen as much research in the context of outlier detection. Existing methods focus on combining output scores of individual detectors, but this leads to outputs that are not easily interpretable. In this paper, we introduce a theoretical foundation for combining individual detectors with Bayesian classifier combination. Not only are posterior distributions easily interpreted as the probability distribution of anomalies, but bias, variance, and individual error rates of detectors are all easily obtained. Performance on real-world datasets shows high accuracy across varied types of time series data.

data mining, detector, machine learning, (19 more...)

1610.07677

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Alquier, Pierre, Guedj, Benjamin

Simpler PAC-Bayesian Bounds for Hostile Data

Learning theory can be traced back to the late 60s and has attracted great attention since. We refer to the monographs Devroye et al. (1996) and Vapnik (2000) for a survey. Most of the literature addresses the simplified case of i.i.d. observations coupled with bounded loss functions. Many bounds on the excess risk holding with large probability were provided; these bounds have been referred to as PAC learning bounds since Valiant (1984). In the late 90s, the PAC-Bayesian approach was pioneered by Shawe-Taylor and Williamson (1997) and McAllester (1998, 1999). It consists of producing PAC bounds for a specific class of Bayesian-flavored estimators. As with classical PAC results, most PAC-Bayesian bounds have been obtained with bounded loss functions (see Catoni, 2007, for some of the most accurate results). Note that Catoni (2004) provides bounds for unbounded loss, but still under very strong exponential moment assumptions. These assumptions were essentially not improved in the most recent works, Guedj and Alquier (2013) and Bégin et al. (2016).

artificial intelligence, assumption, machine learning, (18 more...)

1610.07193

Country: North America > United States (0.19)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Obuchi, Tomoyuki, Koma, Hirokazu, Yasuda, Muneki

Boltzmann-Machine Learning of Prior Distributions of Binarized Natural Images

Prior distributions of binarized natural images are learned by using a Boltzmann machine. According to the results of this study, a structure with two sublattices emerges in the interactions, and the nearest-neighbor and next-nearest-neighbor interactions correspondingly take two discriminative values, which reflects the individual characteristics of the three sets of pictures that we process. Meanwhile, on a longer spatial scale, a longer-range, although still rapidly decaying, ferromagnetic interaction commonly appears in all cases. The characteristic length scale of the interactions is universally up to approximately four lattice spacings, $\xi \approx 4$. These results are derived by using the mean-field method, which effectively reduces the computational time required in a Boltzmann machine. An improved mean-field method called the Bethe approximation gives the same results, as does the Monte Carlo method for small images. These reinforce the validity of our analysis and findings. Relations to criticality, frustration, and simple-cell receptive fields are also discussed.

artificial intelligence, interaction, machine learning, (16 more...)

doi: 10.7566/JPSJ.85.114803

1412.7012

Country: Asia > Japan (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

He, Yangbo, Yu, Bin

Formulas for Counting the Sizes of Markov Equivalence Classes of Directed Acyclic Graphs

The sizes of Markov equivalence classes of directed acyclic graphs play important roles in measuring the uncertainty and complexity in causal learning. A Markov equivalence class can be represented by an essential graph, and its undirected subgraphs determine the size of the class. In this paper, we develop a method to derive the formulas for counting the sizes of Markov equivalence classes. We first introduce the new concept of a core graph. The size of a Markov equivalence class of interest is a polynomial in the number of vertices given its core graph. Then, we discuss the recursive and explicit formulas of the polynomial, and provide an algorithm to derive the size formula via symbolic computation for any given core graph. The proposed size formula derivation sheds light on the relationships between the size of a Markov equivalence class and its representation graph, and makes size counting efficient, even when the essential graphs contain non-sparse undirected subgraphs.
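To make the counted quantity concrete, the sketch below finds the size of a class by brute force. It is not the formula-based method the abstract proposes. It assumes the essential graph is a single undirected, connected, chordal component, so the class consists of every acyclic orientation of the edges that creates no v-structure (two non-adjacent parents pointing at a common child). The function names are illustrative, and the cost grows as 2^m in the number of edges m.

```python
from itertools import product


def is_acyclic(n, arcs):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indeg = [0] * n
    children = [[] for _ in range(n)]
    for u, v in arcs:
        children[u].append(v)
        indeg[v] += 1
    stack = [v for v in range(n) if indeg[v] == 0]
    visited = 0
    while stack:
        u = stack.pop()
        visited += 1
        for w in children[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                stack.append(w)
    return visited == n


def has_v_structure(n, arcs, adjacent):
    """True if some vertex has two parents that are not adjacent."""
    parents = [[] for _ in range(n)]
    for u, v in arcs:
        parents[v].append(u)
    for ps in parents:
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                if frozenset((ps[i], ps[j])) not in adjacent:
                    return True
    return False


def mec_size_undirected(n, edges):
    """Count DAGs in the class whose essential graph is `edges` (undirected)."""
    adjacent = {frozenset(e) for e in edges}
    count = 0
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(v, u) if flip else (u, v)
                for (u, v), flip in zip(edges, flips)]
        if is_acyclic(n, arcs) and not has_v_structure(n, arcs, adjacent):
            count += 1
    return count


path = mec_size_undirected(3, [(0, 1), (1, 2)])
triangle = mec_size_undirected(3, [(0, 1), (1, 2), (0, 2)])
star = mec_size_undirected(4, [(0, 1), (0, 2), (0, 3)])
print(path, triangle, star)
```

For a tree each valid orientation is fixed by the choice of root, and for a complete graph every ordering of the vertices gives a member, which is why exhaustive search becomes impractical on dense undirected subgraphs and closed-form size formulas are attractive.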

artificial intelligence, graph, machine learning, (14 more...)

1610.07921

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Petrovici, Mihai A., Bill, Johannes, Bytschok, Ilja, Schemmel, Johannes, Meier, Karlheinz

Stochastic inference with spiking neurons in the high-conductance state

The highly variable dynamics of neocortical circuits observed in vivo have been hypothesized to represent a signature of ongoing stochastic inference but stand in apparent contrast to the deterministic response of neurons measured in vitro. Based on a propagation of the membrane autocorrelation across spike bursts, we provide an analytical derivation of the neural activation function that holds for a large parameter space, including the high-conductance state. On this basis, we show how an ensemble of leaky integrate-and-fire neurons with conductance-based synapses embedded in a spiking environment can attain the correct firing statistics for sampling from a well-defined target distribution. For recurrent networks, we examine convergence toward stationarity in computer simulations and demonstrate sample-based Bayesian inference in a mixed graphical model. This points to a new computational role of high-conductance states and establishes a rigorous link between deterministic neuron models and functional stochastic dynamics on the network level.

artificial intelligence, bayesian inference, machine learning, (18 more...)

doi: 10.1103/PhysRevE.94.042312

1610.07161

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Weiss, Christian, Zoubir, Abdelhak M.

Dictionary Learning Strategies for Compressed Fiber Sensing Using a Probabilistic Sparse Model

arXiv.org Machine Learning Oct-21-2016

We present a sparse estimation and dictionary learning framework for compressed fiber sensing based on a probabilistic hierarchical sparse model. To handle severe dictionary coherence, selective shrinkage is achieved using a Weibull prior, which can be related to non-convex optimization with $p$-norm constraints for $0 < p < 1$. In addition, we leverage the specific dictionary structure to promote collective shrinkage based on a local similarity model. This is incorporated in the form of a kernel function in the joint prior density of the sparse coefficients, thereby establishing a Markov random field relation. Approximate inference is accomplished using a hybrid technique that combines Hamiltonian Monte Carlo and Gibbs sampling. To estimate the dictionary parameter, we pursue two strategies, relying on either a deterministic or a probabilistic model for the dictionary parameter. In the first strategy, the parameter is estimated based on alternating estimation. In the second strategy, it is jointly estimated along with the sparse coefficients. The performance is evaluated in comparison to an existing method in various scenarios using simulations and experimental data.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1610.06902

Country:

Asia > Japan (0.28)
Europe > Germany (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)