AITopics

1703.0291

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Dermatology (0.50)
Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.83)

Bouveyron, Charles, Latouche, Pierre, Mattei, Pierre-Alexandre

Exact Dimensionality Selection for Bayesian PCA

arXiv.org Machine LearningMar-8-2017

We present a Bayesian model selection approach to estimate the intrinsic dimensionality of a high-dimensional dataset. To this end, we introduce a novel formulation of the probabilisitic principal component analysis model based on a normal-gamma prior distribution. In this context, we exhibit a closed-form expression of the marginal likelihood which allows to infer an optimal number of components. We also propose a heuristic based on the expected shape of the marginal likelihood curve in order to choose the hyperparameters. In non-asymptotic frameworks, we show on simulated data that this exact dimensionality selection approach is competitive with both Bayesian and frequentist state-of-the-art methods.

artificial intelligence, likelihood, machine learning, (13 more...)

1703.02834

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Feng, Yunlong, Fan, Jun, Suykens, Johan A. K.

A Statistical Learning Approach to Modal Regression

arXiv.org Machine LearningMar-8-2017

This paper studies the nonparametric modal regression problem systematically from a statistical learning view. Originally motivated by pursuing a theoretical understanding of the maximum correntropy criterion based regression (MCCR), our study reveals that MCCR with a tending-to-zero scale parameter is essentially modal regression. We show that nonparametric modal regression problem can be approached via the classical empirical risk minimization. Some efforts are then made to develop a framework for analyzing and implementing modal regression. For instance, the modal regression function is described, the modal regression risk is defined explicitly and its \textit{Bayes} rule is characterized; for the sake of computational tractability, the surrogate modal regression risk, which is termed as the generalization risk in our study, is introduced. On the theoretical side, the excess modal regression risk, the excess generalization risk, the function estimation error, and the relations among the above three quantities are studied rigorously. It turns out that under mild conditions, function estimation consistency and convergence may be pursued in modal regression as in vanilla regression protocols, such as mean regression, median regression, and quantile regression. However, it outperforms these regression models in terms of robustness as shown in our study from a re-descending M-estimation view. This coincides with and in return explains the merits of MCCR on robustness. On the practical side, the implementation issues of modal regression including the computational algorithm and the tuning parameters selection are discussed. Numerical assessments on modal regression are also conducted to verify our findings empirically.

artificial intelligence, machine learning, regression, (16 more...)

1702.0596

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Ramazzotti, Daniele, Graudenzi, Alex, Caravagna, Giulio, Antoniotti, Marco

Modeling cumulative biological phenomena with Suppes-Bayes Causal Networks

arXiv.org Artificial IntelligenceMar-8-2017

Noname manuscript No. (will be inserted by the editor) Abstract Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes, with respect to wildtype conditions. Cancer and HIV are two common examples of such diseases, where the mutational load in the cancerous/viral population increases over time. In these cases, selective pressures are often observed along with competition, cooperation and parasitism among distinct cellular clones. Recently, we presented a mathematical framework to model these phenomena, based on a combination of Bayesian inference and Suppes' theory of probabilistic causation, depicted in graphical structures dubbed Suppes-Bayes Causal Networks ( SBCNs). SBCNs are generative probabilistic graphical models that recapitulate the potential ordering of accumulation of such DNA changes during the progression of the disease. Such models can be inferred from data by exploiting likelihood-based model-selection strategies with regularization. In this paper we discuss the theoretical foundations of our approach and we investigate in depth the influence on the model-selection task of: (i) the poset based on Suppes' theory and (ii) different regularization strategies. Furthermore, we provide an example of application of our framework to HIV genetic data high-Daniele Ramazzotti Department of Pathology, Stanford University, Stanford, CA 94305, USA Email: daniele.ramazzotti@stanford.edu Alex Graudenzi Department of Informatics, Systems and Communication, University of Milan-Bicocca, Milan, Italy Giulio Caravagna School of Informatics, University of Edinburgh, Edinburgh, UK Marco Antoniotti Department of Informatics, Systems and Communication, University of Milan-Bicocca, Milan, Italy Keywords Cumulative Phenomena· Bayesian Graphical Models· Probabilistic Causality 1 Introduction A number of diseases are characterized by the accumulation of genomic lesions in the DNA of a population of cells. Such lesions are often classified as mutations, if they involve one or few nucleotides, or chromosomal alterations, if they involve wider regions of a chromosome.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1602.07857

Country:

Europe > Italy > Lombardy > Milan (0.44)
North America > United States > California > Santa Clara County > Stanford (0.24)
North America > United States > California > Santa Clara County > Palo Alto (0.24)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Internal Medicine (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Bonchi, Francesco, Hajian, Sara, Mishra, Bud, Ramazzotti, Daniele

Exposing the Probabilistic Causal Structure of Discrimination

arXiv.org Artificial IntelligenceMar-8-2017

Discrimination discovery from data is an important task aiming at identifying patterns of illegal and unethical discriminatory activities against protected-by-law groups, e.g., ethnic minorities. While any legally-valid proof of discrimination requires evidence of causality, the state-of-the-art methods are essentially correlation-based, albeit, as it is well known, correlation does not imply causation. In this paper we take a principled causal approach to the data mining problem of discrimination detection in databases. Following Suppes' probabilistic causation theory, we define a method to extract, from a dataset of historical decision records, the causal structures existing among the attributes in the data. The result is a type of constrained Bayesian network, which we dub Suppes-Bayes Causal Network (SBCN). Next, we develop a toolkit of methods based on random walks on top of the SBCN, addressing different anti-discrimination legal concepts, such as direct and indirect discrimination, group and individual discrimination, genuine requirement, and favoritism. Our experiments on real-world datasets confirm the inferential power of our approach in all these different tasks.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s41060-016-0040-z

1510.00552

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.14)

Genre: Research Report (0.84)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Law > Labor & Employment Law (0.67)
Information Technology > Security & Privacy (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Deep Probabilistic Programming

Tran, Dustin, Hoffman, Matthew D., Saurous, Rif A., Brevdo, Eugene, Murphy, Kevin, Blei, David M.

We propose Edward, a Turing-complete probabilistic programming language. Edward defines two compositional representations---random variables and inference. By treating inference as a first class citizen, on a par with modeling, we show that probabilistic programming can be as flexible and computationally efficient as traditional deep learning. For flexibility, Edward makes it easy to fit the same model using a variety of composable inference methods, ranging from point estimation to variational inference to MCMC. In addition, Edward can reuse the modeling representation as part of inference, facilitating the design of rich variational models and generative adversarial networks. For efficiency, Edward is integrated into TensorFlow, providing significant speedups over existing probabilistic systems. For example, we show on a benchmark logistic regression task that Edward is at least 35x faster than Stan and 6x faster than PyMC3. Further, Edward incurs no runtime overhead: it is as fast as handwritten TensorFlow.

inference, sigma tf, variational inference, (14 more...)

1701.03757

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Steorts, Rebecca C., Barnes, Matt, Neiswanger, Willie

Performance Bounds for Graphical Record Linkage

Record linkage involves merging records in large, noisy databases to remove duplicate entities. It has become an important area because of its widespread occurrence in bibliometrics, public health, official statistics production, political science, and beyond. Traditional linkage methods directly linking records to one another are computationally infeasible as the number of records grows. As a result, it is increasingly common for researchers to treat record linkage as a clustering task, in which each latent entity is associated with one or more noisy database records. We critically assess performance bounds using the Kullback-Leibler (KL) divergence under a Bayesian record linkage framework, making connections to Kolchin partition models. We provide an upper bound using the KL divergence and a lower bound on the minimum probability of misclassifying a latent entity. We give insights for when our bounds hold using simulated data and provide practical user guidance.

artificial intelligence, latent entity, machine learning, (15 more...)

1703.02679

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Sala, Frederic, Kabir, Shahroze, Broeck, Guy Van den, Dolecek, Lara

Don't Fear the Bit Flips: Optimized Coding Strategies for Binary Classification

After being trained, classifiers must often operate on data that has been corrupted by noise. In this paper, we consider the impact of such noise on the features of binary classifiers. Inspired by tools for classifier robustness, we introduce the same classification probability (SCP) to measure the resulting distortion on the classifier outputs. We introduce a low-complexity estimate of the SCP based on quantization and polynomial multiplication. We also study channel coding techniques based on replication error-correcting codes. In contrast to the traditional channel coding approach, where error-correction is meant to preserve the data and is agnostic to the application, our schemes specifically aim to maximize the SCP (equivalently minimizing the distortion of the classifier output) for the same redundancy overhead.

artificial intelligence, machine learning, probability, (19 more...)

1703.02641

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Media > Film (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

Shashua, Shirli Di-Castro, Mannor, Shie

Deep Robust Kalman Filter

A Robust Markov Decision Process (RMDP) is a sequential decision making model that accounts for uncertainty in the parameters of dynamic systems. This uncertainty introduces difficulties in learning an optimal policy, especially for environments with large state spaces. We propose two algorithms, RTD-DQN and Deep-RoK, for solving large-scale RMDPs using nonlinear approximation schemes such as deep neural networks. The RTD-DQN algorithm incorporates the robust Bellman temporal difference error into a robust loss function, yielding robust policies for the agent. The Deep-RoK algorithm is a robust Bayesian method, based on the Extended Kalman Filter (EKF), that accounts for both the uncertainty in the weights of the approximated value function and the uncertainty in the transition probabilities, improving the robustness of the agent. We provide theoretical results for our approach and test the proposed algorithms on a continuous state domain.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1703.0231

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Grigo, Constantin, Koutsourelakis, Phaedon-Stelios

Probabilistic Reduced-Order Modeling for Stochastic Partial Differential Equations

arXiv.org Machine LearningMar-6-2017

We discuss a Bayesian formulation to coarse-graining (CG) of PDEs where the coefficients (e.g. material parameters) exhibit random, fine scale variability. The direct solution to such problems requires grids that are small enough to resolve this fine scale variability which unavoidably requires the repeated solution of very large systems of algebraic equations. We establish a physically inspired, data-driven coarse-grained model which learns a low- dimensional set of microstructural features that are predictive of the fine-grained model (FG) response. Once learned, those features provide a sharp distribution over the coarse scale effec- tive coefficients of the PDE that are most suitable for prediction of the fine scale model output. This ultimately allows to replace the computationally expensive FG by a generative proba- bilistic model based on evaluating the much cheaper CG several times. Sparsity enforcing pri- ors further increase predictive efficiency and reveal microstructural features that are important in predicting the FG response. Moreover, the model yields probabilistic rather than single-point predictions, which enables the quantification of the unavoidable epistemic uncertainty that is present due to the information loss that occurs during the coarse-graining process.

artificial intelligence, data mining, machine learning, (20 more...)

1703.01962

Country:

Europe (0.93)
North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)