AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)
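
As a minimal sketch of the definition above, the following Python snippet encodes a three-node DAG and computes the joint distribution as the product of each node's conditional probability given its parents. The network (Rain, Sprinkler, WetGrass) and every probability value are hypothetical illustration values, not taken from the source.

```python
from itertools import product

# Hypothetical DAG: Rain -> Sprinkler, Rain -> WetGrass, Sprinkler -> WetGrass.
# All probabilities below are made-up illustration values.
P_RAIN = {True: 0.2, False: 0.8}
# P(Sprinkler | Rain), indexed as P_SPRINKLER[rain][sprinkler]
P_SPRINKLER = {True: {True: 0.01, False: 0.99},
               False: {True: 0.4, False: 0.6}}
# P(WetGrass = True | Sprinkler, Rain), indexed by (sprinkler, rain)
P_WET_TRUE = {(True, True): 0.99, (True, False): 0.9,
              (False, True): 0.8, (False, False): 0.0}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's conditional given its parents."""
    p_wet = P_WET_TRUE[(sprinkler, rain)]
    return (P_RAIN[rain]
            * P_SPRINKLER[rain][sprinkler]
            * (p_wet if wet else 1.0 - p_wet))


def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True), obtained by summing the joint over Sprinkler."""
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return num / den


total = sum(joint(r, s, w) for r, s, w in product((True, False), repeat=3))
print(f"sum of joint = {total:.6f}")
print(f"P(rain | wet grass) = {posterior_rain_given_wet():.4f}")
```

Because the graph is acyclic, the product of the local conditionals is a valid joint distribution, which is why the eight joint entries sum to one.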

Adversarial Robustness of Flow-Based Generative Models

Pope, Phillip, Balaji, Yogesh, Feizi, Soheil

arXiv.org Machine Learning | Nov-19-2019

Flow-based generative models leverage invertible generator functions to fit a distribution to the training data using maximum likelihood. Despite their use in several application domains, robustness of these models to adversarial attacks has hardly been explored. In this paper, we study adversarial robustness of flow-based generative models both theoretically (for some simple models) and empirically (for more complex ones). First, we consider a linear flow-based generative model and compute optimal sample-specific and universal adversarial perturbations that maximally decrease the likelihood scores. Using this result, we study the robustness of the well-known adversarial training procedure, where we characterize the fundamental trade-off between model robustness and accuracy. Next, we empirically study the robustness of two prominent deep, non-linear, flow-based generative models, namely GLOW and RealNVP. We design two types of adversarial attacks: one that minimizes the likelihood scores of in-distribution samples, and another that maximizes the likelihood scores of out-of-distribution ones. We find that GLOW and RealNVP are extremely sensitive to both types of attacks. Finally, using a hybrid adversarial training procedure, we significantly boost the robustness of these generative models.
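
To make the idea of a likelihood-decreasing perturbation concrete, here is a rough sketch for a two-dimensional linear flow x = A z + b with a standard normal base distribution. The matrix `A`, the offset `B`, the sample `x_clean`, and the budget `eps` are all hypothetical values. The attack shown is a single normalized gradient step, which is a simpler stand-in and not the optimal perturbation derived in the paper.

```python
import math

# Hypothetical 2-D linear flow x = A z + b with z ~ N(0, I); A and B are made-up values.
A = [[2.0, 0.0],
     [1.0, 1.0]]
B = [0.5, -1.0]


def inverse_2x2(m):
    """Return the inverse and determinant of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return inv, det


A_INV, DET_A = inverse_2x2(A)


def to_latent(x):
    """Invert the flow: z = A^{-1} (x - b)."""
    d = [x[0] - B[0], x[1] - B[1]]
    return [A_INV[0][0] * d[0] + A_INV[0][1] * d[1],
            A_INV[1][0] * d[0] + A_INV[1][1] * d[1]]


def log_likelihood(x):
    """Change of variables: log p(x) = log N(z; 0, I) - log|det A|."""
    z = to_latent(x)
    log_base = -0.5 * (z[0] ** 2 + z[1] ** 2) - math.log(2.0 * math.pi)
    return log_base - math.log(abs(DET_A))


def gradient_attack(x, eps):
    """Move x by an L2 step of size eps against the log-likelihood gradient."""
    z = to_latent(x)
    # grad_x log p(x) = -A^{-T} z, so the descent direction is +A^{-T} z.
    g = [A_INV[0][0] * z[0] + A_INV[1][0] * z[1],
         A_INV[0][1] * z[0] + A_INV[1][1] * z[1]]
    norm = math.hypot(g[0], g[1])
    if norm == 0.0:  # x sits exactly at the mode, so there is no descent direction
        return list(x)
    return [x[0] + eps * g[0] / norm, x[1] + eps * g[1] / norm]


x_clean = [1.5, 0.0]
x_adv = gradient_attack(x_clean, eps=0.5)
print("clean log-likelihood:    ", log_likelihood(x_clean))
print("perturbed log-likelihood:", log_likelihood(x_adv))
```

For this model the log-likelihood is a concave quadratic in x, so any step along the negative gradient from a point away from the mode lowers the score.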

adversarial training, generative model, robustness, (15 more...)

arXiv.org Machine Learning

1911.08654

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(5 more...)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (0.57)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
(2 more...)

Information-Theoretic Local Minima Characterization and Regularization

Jia, Zhiwei, Su, Hao

arXiv.org Machine Learning | Nov-19-2019

Abstract: Recent advances in deep learning theory have evoked the study of generalizability across different local minima of deep neural networks (DNNs). While current work focused on either discovering properties of good local minima or developing regularization techniques to induce good local minima, no approach exists that can tackle both problems. We achieve these two goals successfully in a unified manner. Specifically, based on the Fisher information we propose a metric both strongly indicative of generalizability of local minima and effectively applied as a practical regularizer. We provide theoretical analysis including a generalization bound and empirically demonstrate the success of our approach in both capturing and improving the generalizability of DNNs. Experiments are performed on CIFAR-10 and CIFAR-100 for various network architectures. 1 Introduction: Recently, there has been a surge in the interest of acquiring a theoretical understanding over deep neural network's behavior. Breakthroughs have been made in characterizing the optimization process, showing that learning algorithms such as stochastic gradient descent (SGD) tend to end up in one of the many local minima which have close-to-zero training loss (Choromanska et al., 2015; Dauphin et al., 2014; Kawaguchi, 2016; Nguyen & Hein, 2018; Du et al., 2018). It is, therefore, natural to ask two closely related questions: (a) What kind of local minima can generalize better? To our knowledge, existing work focused only on one of the two questions. For the "what" question, various definitions of "flatness/sharpness" have been introduced and analyzed (Keskar et al., 2017; Neyshabur et al., 2018; 2017; Wu et al., 2017; Liang et al., 2017).

generalization, local minima, minima, (14 more...)

arXiv.org Machine Learning

1911.08192

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Automatic Detection of Satire in Bangla Documents: A CNN Approach Based on Hybrid Feature Extraction Model

Sharma, Arnab Sen, Mridul, Maruf Ahmed, Islam, Md Saiful

arXiv.org Artificial Intelligence | Nov-19-2019

Wide spread of satirical news in online communities is an ongoing trend. The nature of satire is so inherently ambiguous that sometimes it is too hard even for humans to tell whether a text is actually satire or not. So, research interest has grown in this field. The purpose of this research is to detect Bangla satirical news spread in online news portals as well as social media. In this paper we propose a hybrid technique for extracting features from text documents, combining Word2Vec and TF-IDF. Using our proposed feature extraction technique with a standard CNN architecture, we could detect whether a Bangla text document is satire or not with an accuracy of more than 96%. Satire can be considered a literary form which involves a delicate balance between criticism and humor.
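
The abstract does not say how the two feature types are combined, so the sketch below shows just one common possibility: a TF-IDF-weighted mean of word vectors. It rests on loud assumptions: the tokenized documents in `DOCS` are made-up English stand-ins for Bangla text, `VECTORS` holds hand-written two-dimensional vectors in place of trained Word2Vec embeddings, and the smoothed IDF formula is a conventional choice, not one confirmed by the paper. The CNN classifier that would consume these features is not shown.

```python
import math
from collections import Counter

# Toy tokenized corpus and 2-D word vectors; both are made-up stand-ins
# for real Bangla documents and trained Word2Vec embeddings.
DOCS = [["minister", "rides", "dragon"],
        ["minister", "opens", "bridge"],
        ["dragon", "opens", "bank", "account"]]
VECTORS = {"minister": [0.9, 0.1], "rides": [0.2, 0.7], "dragon": [0.1, 0.9],
           "opens": [0.5, 0.5], "bridge": [0.8, 0.2], "bank": [0.7, 0.3],
           "account": [0.6, 0.4]}


def idf_table(docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}


def hybrid_features(doc, idf):
    """TF-IDF-weighted mean of the word vectors of one document."""
    tf = Counter(doc)
    # Words without a vector are skipped; unseen words get an IDF of 0.
    weights = {w: (tf[w] / len(doc)) * idf.get(w, 0.0) for w in tf if w in VECTORS}
    total = sum(weights.values())
    dim = len(next(iter(VECTORS.values())))
    if total == 0.0:
        return [0.0] * dim
    return [sum(weights[w] * VECTORS[w][i] for w in weights) / total
            for i in range(dim)]


IDF = idf_table(DOCS)
FEATURES = [hybrid_features(doc, IDF) for doc in DOCS]
for doc, feat in zip(DOCS, FEATURES):
    print(doc, [round(v, 3) for v in feat])
```

Rare words receive a larger IDF and therefore pull the document vector toward their own embedding, which is the usual motivation for weighting embeddings this way.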

detection, vector, word2vec model, (13 more...)

arXiv.org Artificial Intelligence

1911.11062

Country:

Asia > Bangladesh (0.05)
Europe > Ukraine (0.04)
Africa > Eritrea > Maekel > Asmara (0.04)

Genre: Research Report (0.50)

Industry:

Media > News (0.51)
Information Technology > Services (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

benedekrozemberczki/awesome-gradient-boosting-papers

#artificialintelligence | Nov-18-2019, 14:53:48 GMT

How to Make AdaBoost.M1 Work for Weak Base Classifiers by Changing Only One Line of the Code (ECML 2002)

algorithm, classification, learning, (14 more...)

#artificialintelligence

Industry:

Education (0.47)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(3 more...)

Consistent recovery threshold of hidden nearest neighbor graphs

Ding, Jian, Wu, Yihong, Xu, Jiaming, Yang, Dana

arXiv.org Machine Learning | Nov-18-2019

Jian Ding, Yihong Wu, Jiaming Xu, and Dana Yang. November 20, 2019. Abstract: Motivated by applications such as discovering strong ties in social networks and assembling genome subsequences in biology, we study the problem of recovering a hidden $2k$-nearest neighbor (NN) graph in an $n$-vertex complete graph, whose edge weights are independent and distributed according to $P_n$ for edges in the hidden $2k$-NN graph and $Q_n$ otherwise. We focus on two types of asymptotic recovery guarantees as $n \to \infty$: (1) exact recovery: all edges are classified correctly with probability tending to one; (2) almost exact recovery: the expected number of misclassified edges is $o(nk)$. We show that the maximum likelihood estimator achieves (1) exact recovery for $2 \le k \le n^{o(1)}$ if $\liminf \frac{2\alpha_n}{\log n} > 1$; (2) almost exact recovery for $1 \le k \le o\left(\frac{\log n}{\log\log n}\right)$ if $\liminf \frac{k D(P_n \| Q_n)}{\log n} > 1$, where $\alpha_n \triangleq -2 \log \int \sqrt{dP_n \, dQ_n}$ is the Rényi divergence of order $\frac{1}{2}$ and $D(P_n \| Q_n)$ is the Kullback-Leibler divergence.

exact recovery, graph, recovery, (17 more...)

arXiv.org Machine Learning

1911.08004

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.93)
Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Deep Detector Health Management under Adversarial Campaigns

Echauz, Javier, Kenemer, Keith, Hussein, Sarfaraz, Dhaliwal, Jay, Shintre, Saurabh, Grzonkowski, Slawomir, Gardner, Andrew

arXiv.org Machine Learning | Nov-18-2019

Machine learning models are vulnerable to adversarial inputs that induce seemingly unjustifiable errors. As automated classifiers are increasingly used in industrial control systems and machinery, these adversarial errors could grow to be a serious problem. Despite numerous studies over the past few years, the field of adversarial ML is still considered alchemy, with no practical unbroken defenses demonstrated to date, leaving PHM practitioners with few meaningful ways of addressing the problem. We introduce turbidity detection as a practical superset of the adversarial input detection problem, coping with adversarial campaigns rather than statistically invisible one-offs. This perspective is coupled with ROC-theoretic design guidance that prescribes an inexpensive domain adaptation layer at the output of a deep learning model during an attack campaign. The result aims to approximate the Bayes optimal mitigation that ameliorates the detection model's degraded health. A proactively reactive type of prognostics is achieved via Monte Carlo simulation of various adversarial campaign scenarios, by sampling from the model's own turbidity distribution to quickly deploy the correct mitigation during a real-world campaign. A machine learning application often begins with a dataset of examples and the task is to find a classification model that will turn inputs into class-label predictions, while preserving some sense of minimum expected error. But less obviously, it is often possible to deterministically find input examples that force the model to misclassify (Szegedy et al., 2014). This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

detector, figure 2, regular environment, (15 more...)

arXiv.org Machine Learning

1911.0809

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)
Health & Medicine (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Iterative Construction of Gaussian Process Surrogate Models for Bayesian Inference

Alawieh, Leen, Goodman, Jonathan, Bell, John B.

arXiv.org Machine Learning | Nov-17-2019

A new algorithm is developed to tackle the issue of sampling non-Gaussian model parameter posterior probability distributions that arise from solutions to Bayesian inverse problems. The algorithm aims to mitigate some of the hurdles faced by traditional Markov Chain Monte Carlo (MCMC) samplers by constructing proposal probability densities that are both easy to sample and a better approximation to the target density than a simple Gaussian proposal distribution would be. To achieve that, a Gaussian proposal distribution is augmented with a Gaussian Process (GP) surface that helps capture non-linearities in the log-likelihood function. In order to train the GP surface, an iterative approach is adopted for the optimal selection of points in parameter space. Optimality is sought by maximizing the information gain of the GP surface using a minimum number of forward model simulation runs. The accuracy of the GP-augmented surface approximation is assessed in two ways. The first consists of comparing predictions obtained from the approximate surface with those obtained through running the actual simulation model at hold-out points in parameter space. The second consists of a measure based on the relative variance of sample weights obtained from sampling the approximate posterior probability distribution of the model parameters. The efficacy of this new algorithm is tested on inferring reaction rate parameters in 3-node and 6-node network toy problems, which imitate idealized reaction networks in combustion applications.
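
The second accuracy measure mentioned above can be illustrated with a generic importance-sampling diagnostic. The sketch below is an assumption about what such a measure might look like, not the paper's definition. It draws samples from a Gaussian proposal, forms weights w = target / proposal, and reports Var(w) / mean(w)^2. The function names, the one-dimensional `toy_log_target`, and all numeric settings are hypothetical, and no Gaussian process surrogate is built here.

```python
import math
import random


def log_gauss(x, mean, std):
    """Log density of a 1-D Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def relative_weight_variance(log_target, prop_mean, prop_std, n=5000, seed=0):
    """Var(w) / mean(w)^2 for weights w = target(x) / proposal(x), x ~ proposal."""
    rng = random.Random(seed)
    log_w = []
    for _ in range(n):
        x = rng.gauss(prop_mean, prop_std)
        log_w.append(log_target(x) - log_gauss(x, prop_mean, prop_std))
    shift = max(log_w)  # rescaling the weights leaves the ratio unchanged
    w = [math.exp(lw - shift) for lw in log_w]
    mean_w = sum(w) / n
    var_w = sum((wi - mean_w) ** 2 for wi in w) / n
    return var_w / mean_w ** 2


def toy_log_target(x):
    """Made-up unnormalised, non-Gaussian log posterior."""
    return -0.5 * x ** 2 - 0.25 * x ** 4


# Proposal equal to the target: every weight is identical.
exact = relative_weight_variance(lambda x: log_gauss(x, 0.0, 1.0), 0.0, 1.0)
# Proposal centred on the target versus one shifted far from it.
good = relative_weight_variance(toy_log_target, 0.0, 0.7)
poor = relative_weight_variance(toy_log_target, 2.0, 0.7)
print("exact proposal:  ", exact)
print("centred proposal:", good)
print("shifted proposal:", poor)
```

A value near zero indicates that the proposal tracks the target closely, while a large value means a few samples dominate the weights.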

algorithm, experiment, training point, (16 more...)

arXiv.org Machine Learning

doi: 10.1016/j.jspi.2019.11.002

1911.07227

Country:

Asia > Middle East > Lebanon > Beirut Governorate > Beirut (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Energy (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Causality-based Feature Selection: Methods and Evaluations

Yu, Kui, Guo, Xianjie, Liu, Lin, Li, Jiuyong, Wang, Hao, Ling, Zhaolong, Wu, Xindong

arXiv.org Artificial Intelligence | Nov-16-2019

Feature selection is a crucial preprocessing step in data analytics and machine learning. Classical feature selection algorithms select features based on the correlations between predictive features and the class variable and do not attempt to capture causal relationships between them. It has been shown that knowledge of the causal relationships between features and the class variable has potential benefits for building interpretable and robust prediction models, since causal relationships imply the underlying mechanism of a system. Consequently, causality-based feature selection has gradually attracted greater attention and many algorithms have been proposed. In this paper, we present a comprehensive review of recent advances in causality-based feature selection. To facilitate the development of new algorithms in the research area and to ease comparisons between new methods and existing ones, we develop the first open-source package, called CausalFS, which consists of most of the representative causality-based feature selection algorithms (available at https://github.com/kuiy/CausalFS). Using CausalFS, we conduct extensive experiments to compare the representative algorithms with both synthetic and real-world data sets. Finally, we discuss some challenging problems to be tackled in future causality-based feature selection research.

algorithm, class variable, cpc, (16 more...)

arXiv.org Artificial Intelligence

1911.07147

Country:

Asia > China > Anhui Province > Hefei (0.04)
Oceania > Australia > South Australia (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Causal inference using Bayesian non-parametric quasi-experimental design

Hinne, Max, van Gerven, Marcel A. J., Ambrogioni, Luca

arXiv.org Machine Learning | Nov-15-2019

The de facto standard for causal inference is the randomized controlled trial, where one compares a manipulated group with a control group in order to determine the effect of an intervention. However, this research design is not always realistically possible due to pragmatic or ethical concerns. In these situations, quasi-experimental designs may provide a solution, as these allow for causal conclusions at the cost of additional design assumptions. In this paper, we provide a generic framework for quasi-experimental design using Bayesian model comparison, and we show how it can be used as an alternative to several common research designs. We provide a theoretical motivation for a Gaussian process based approach and demonstrate its convenient use in a number of simulations. Finally, we apply the framework to determine the effect of population-based thresholds for municipality funding in France, of the 2005 smoking ban in Sicily on the number of acute coronary events, and of an alleged historical phantom border in the Netherlands on Dutch voting behaviour.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1911.06722

Country:

Europe > Netherlands (0.34)
Europe > Italy > Sicily (0.24)
Europe > France (0.24)
(7 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

List of supervised and unsupervised Machine Learning Algorithms

#artificialintelligence | Nov-14-2019, 05:57:00 GMT

Many machine learning algorithms are available, and a handful of them cover most everyday use cases. Some of these algorithms are listed here.

machine learning algorithm, unsupervised machine learning algorithm

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.64)
