AITopics

Industry: Banking & Finance (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

#artificialintelligenceOct-26-2019, 11:28:57 GMT

A Gentle Introduction to Cross-Entropy for Machine Learning

Cross-entropy is commonly used in machine learning as a loss function. Cross-entropy is a measure from the field of information theory, building upon entropy and generally calculating the difference between two probability distributions. It is closely related to but is different from KL divergence that calculates the relative entropy between two probability distributions, whereas cross-entropy can be thought to calculate the total entropy between the distributions. Cross-entropy is also related to and often confused with logistic loss, called log loss. Although the two measures are derived from a different source, when used as loss functions for classification models, both measures calculate the same quantity and can be used interchangeably. In this tutorial, you will discover cross-entropy for machine learning.

entropy, kl divergence, probability distribution, (13 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

#artificialintelligenceOct-26-2019, 11:28:57 GMT

A Gentle Introduction to Cross-Entropy for Machine Learning

entropy, kl divergence, probability distribution, (13 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

#artificialintelligenceOct-26-2019, 06:15:28 GMT

A Gentle Introduction to Maximum Likelihood Estimation for Machine Learning

Density estimation is the problem of estimating the probability distribution for a sample of observations from a problem domain. There are many techniques for solving density estimation, although a common framework used throughout the field of machine learning is maximum likelihood estimation. Maximum likelihood estimation involves defining a likelihood function for calculating the conditional probability of observing the data sample given a probability distribution and distribution parameters. This approach can be used to search a space of possible distributions and parameters. This flexible probabilistic framework also provides the foundation for many machine learning algorithms, including important methods such as linear regression and logistic regression for predicting numeric values and class labels respectively, but also more generally for deep learning artificial neural networks.

maximum likelihood estimation, probability, probability distribution, (9 more...)

Genre: Instructional Material (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningOct-26-2019

Fair Generative Modeling via Weak Supervision

Grover, Aditya, Choi, Kristy, Shu, Rui, Ermon, Stefano

Real-world datasets are often biased with respect to key demographic factors such as race and gender. Due to the latent nature of the underlying factors, detecting and mitigating bias is especially challenging for unsupervised machine learning. We present a weakly supervised algorithm for overcoming dataset bias for deep generative models. Our approach requires access to an additional small, unlabeled but unbiased dataset as the supervision signal, thus sidestepping the need for explicit labels on the underlying bias factors. Using this supplementary dataset, we detect the bias in existing datasets via a density ratio technique and learn generative models which efficiently achieve the twin goals of: 1) data efficiency by using training examples from both biased and unbiased datasets for learning, 2) unbiased data generation at test time. Empirically, we demonstrate the efficacy of our approach which reduces bias w.r.t. latent factors by 57.1% on average over baselines for comparable image generation using generative adversarial networks.

artificial intelligence, classifier, machine learning, (16 more...)

1910.12008

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Pal, Soumyasundar, Regol, Florence, Coates, Mark

Bayesian Graph Convolutional Neural Networks Using Non-Parametric Graph Learning

arXiv.org Machine LearningOct-26-2019

Graph convolutional neural networks (GCNN) have been successfully applied to many different graph based learning tasks including node and graph classification, matrix completion, and learning of node embeddings. Despite their impressive performance, the techniques have a limited capability to incorporate the uncertainty in the underlined graph structure. In order to address this issue, a Bayesian GCNN (BGCN) framework was recently proposed. In this framework, the observed graph is considered to be a random realization from a parametric random graph model and the joint Bayesian inference of the graph and GCNN weights is performed. In this paper, we propose a non-parametric generative model for graphs and incorporate it within the BGCN framework. In addition to the observed graph, our approach effectively uses the node features and training labels in the posterior inference of graphs and attains superior or comparable performance in benchmark node classification tasks.

graph, neural network, proc, (15 more...)

1910.12132

Country:

North America > United States > Arizona > Maricopa County > Scottsdale (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Netherlands (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Hegde, Srinidhi, Prasad, Ranjitha, Hebbalaguppe, Ramya, Kumar, Vishwajith

Variational Student: Learning Compact and Sparser Networks in Knowledge Distillation Framework

arXiv.org Machine LearningOct-26-2019

The holy grail in deep neural network research is porting the memory- and computation-intensive network models on embedded platforms with a minimal compromise in model accuracy. To this end, we propose a novel approach, termed as Variational Student, where we reap the benefits of compressibility of the knowledge distillation (KD) framework, and sparsity inducing abilities of variational inference (VI) techniques. Essentially, we build a sparse student network, whose sparsity is induced by the variational parameters found via optimizing a loss function based on VI, leveraging the knowledge learnt by an accurate but complex pre-trained teacher network. Further, for sparsity enhancement, we also employ a Block Sparse Regularizer on a concatenated tensor of teacher and student network weights. We demonstrate that the marriage of KD and the VI techniques inherits compression properties from the KD framework, and enhances levels of sparsity from the VI approach, with minimal compromise in the model accuracy. We benchmark our results on LeNet MLP and VGGNet (CNN) and illustrate a memory footprint reduction of 64x and 213x on these MLP and CNN variants, respectively, without a need to retrain the teacher network. Furthermore, in the low data regime, we observed that our method outperforms state-of-the-art Bayesian techniques in terms of accuracy.

sparsity, student network, teacher network, (14 more...)

1910.12061

Country:

Asia > India > West Bengal > Kharagpur (0.04)
Asia > India > NCT > New Delhi (0.04)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

#artificialintelligenceOct-25-2019, 23:17:27 GMT

11 Alternatives To Keras For Deep Learning Enthusiasts

Infer.NET is a machine learning framework for running Bayesian inference in graphical models. It provides state-of-the-art message-passing algorithms and statistical routines needed to perform inference for a wide variety of applications. There are various intuitive features in this framework such as rich modelling language, multiple inference algorithms, designed for large scale inference as well as user-extendable. With the help of this framework, various Bayesian models such as Bayes Point Machine classifiers, TrueSkill matchmaking, hidden Markov models, and Bayesian networks can be implemented with ease.

deep learning enthusiast, inference, keras

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.75)

arXiv.org Machine LearningOct-25-2019

Descriptive Dimensionality and Its Characterization of MDL-based Learning and Change Detection

Yamanishi, Kenji

This paper introduces a new notion of dimensionality of probabilistic models from an information-theoretic view point. We call it the "descriptive dimension"(Ddim). We show that Ddim coincides with the number of independent parameters for the parametric class, and can further be extended to real-valued dimensionality when a number of models are mixed. The paper then derives the rate of convergence of the MDL (Minimum Description Length) learning algorithm which outputs a normalized maximum likelihood (NML) distribution with model of the shortest NML codelength. The paper proves that the rate is governed by Ddim. The paper also derives error probabilities of the MDL-based test for multiple model change detection. It proves that they are also governed by Ddim. Through the analysis, we demonstrate that Ddim is an intrinsic quantity which characterizes the performance of the MDL-based learning and change detection.

ddim, dimensionality, nml, (14 more...)

1910.1154

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Rajendran, Vickram, LeVine, William

Accurate Layerwise Interpretable Competence Estimation

arXiv.org Machine LearningOct-24-2019

Estimating machine learning performance 'in the wild' is both an important and unsolved problem. In this paper, we seek to examine, understand, and predict the pointwise competence of classification models. Our contributions are twofold: First, we establish a statistically rigorous definition of competence that generalizes the common notion of classifier confidence; second, we present the ALICE (Accurate Layerwise Interpretable Competence Estimation) Score, a pointwise competence estimator for any classifier. By considering distributional, data, and model uncertainty, ALICE empirically shows accurate competence estimation in common failure situations such as class-imbalanced datasets, out-of-distribution datasets, and poorly trained models. Our contributions allow us to accurately predict the competence of any classification model given any input and error function. We compare our score with state-of-the-art confidence estimators such as model confidence and Trust Score, and show significant improvements in competence prediction over these methods on datasets such as DIGITS, CIFAR10, and CIFAR100.

alice score, competence, error function, (13 more...)

1910.11363

Country:

North America > Canada > Ontario > Toronto (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Maryland > Prince George's County > Laurel (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)