AITopics

2101.10435

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(20 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.69)
(2 more...)

arXiv.org Artificial IntelligenceJan-25-2021

Understanding and Achieving Efficient Robustness with Adversarial Contrastive Learning

Bui, Anh, Le, Trung, Zhao, He, Montague, Paul, Camtepe, Seyit, Phung, Dinh

Among them, the adversarial training methods (e.g, FGSM, PGD adversarial training [13, 22] and Contrastive learning (CL) has recently emerged as an TRADES [36] that utilize adversarial examples as training effective approach to learning representation in a range of data, have been one of the most effective approaches, which downstream tasks. Central to this approach is the selection truly boost the model robustness without the facing the of positive (similar) and negative (dissimilar) sets to provide problem of obfuscated gradients [3]. In adversarial training, the model the opportunity to'contrast' between data recent works [34, 4] show that reducing the divergence and class representation in the latent space. In this paper, of the representations of images and their adversarial examples we investigate CL for improving model robustness using adversarial in latent space (e.g., the feature space output from an samples. We first designed and performed a comprehensive intermediate layer of a classifier) can significantly improve study to understand how adversarial vulnerability the robustness. For example, in [4], latent representations behaves in the latent space. Based on these empirical of images in the same class are pulled closer together than evidences, we propose an effective and efficient supervised those in different classes, which led to a more compact latent contrastive learning to achieve model robustness against space and consequently, better robustness.

adversarial example, divergence, robustness, (13 more...)

2101.10027

Country:

Oceania > Australia (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Cai, Zhaowei, Ravichandran, Avinash, Maji, Subhransu, Fowlkes, Charless, Tu, Zhuowen, Soatto, Stefano

Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning

arXiv.org Artificial IntelligenceJan-21-2021

We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. Unlike the standard BN, where the statistics are computed within each batch, EMAN, used in the teacher, updates its statistics by exponential moving average from the BN statistics of the student. This design reduces the intrinsic cross-sample dependency of BN and enhance the generalization of the teacher. EMAN improves strong baselines for self-supervised learning by 4-6/1-2 points and semi-supervised learning by about 7/2 points, when 1%/10% supervised labels are available on ImageNet. These improvements are consistent across methods, network architectures, training duration, and datasets, demonstrating the general effectiveness of this technique.

eman, learning, statistics, (16 more...)

2101.08482

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre: Research Report (0.64)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Terashita, Naoyuki, Ohashi, Hiroki, Nonaka, Yuichi, Kanemaru, Takashi

Influence Estimation for Generative Adversarial Networks

arXiv.org Machine LearningJan-20-2021

Identifying harmful instances, whose absence in a training dataset improves model performance, is important for building better machine learning models. Although previous studies have succeeded in estimating harmful instances under supervised settings, they cannot be trivially extended to generative adversarial networks (GANs). This is because previous approaches require that (1) the absence of a training instance directly affects the loss value and that (2) the change in the loss directly measures the harmfulness of the instance for the performance of a model. In GAN training, however, neither of the requirements is satisfied. This is because, (1) the generator's loss is not directly affected by the training instances as they are not part of the generator's training steps, and (2) the values of GAN's losses normally do not capture the generative performance of a model. To this end, (1) we propose an influence estimation method that uses the Jacobian of the gradient of the generator's loss with respect to the discriminator's parameters (and vice versa) to trace how the absence of an instance in the discriminator's training affects the generator's parameters, and (2) we propose a novel evaluation scheme, in which we assess harmfulness of each training instance on the basis of how GAN evaluation metric (e.g., inception score) is expect to change due to the removal of the instance. We experimentally verified that our influence estimation method correctly inferred the changes in GAN evaluation metrics. Further, we demonstrated that the removal of the identified harmful instances effectively improved the model's generative performance with respect to various GAN evaluation metrics.

generative adversarial network, influence estimation

2101.08367

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Kutuzova, Svetlana, Krause, Oswin, McCloskey, Douglas, Nielsen, Mads, Igel, Christian

Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts

arXiv.org Artificial IntelligenceJan-18-2021

Multimodal generative modelling is important because information about real-world objects typically comes in different representations, or modalities. The information provided by each modality may be erroneous and/or incomplete, and a complete reconstruction of the full information can often only be achieved by combining several modalities. For example, in image-and video-guided translation (Caglayan et al., 2019), additional visual context can potentially resolve ambiguities (e.g., noun genders) when translating written text. In many applications, modalities may be missing for a subset of the observed samples during training and deployment. Often the description of an object in one modality is easy to obtain, while annotating it with another modality is slow and expensive. Given two modalities, we call samples paired when both modalities are present, and unpaired if one is missing. The simplest way to deal with paired and unpaired training examples is to discard the unpaired observations for learning.

architecture, modality, reconstruction, (15 more...)

2101.0724

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > California (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Ghosh, Atin, Thiery, Alexandre H.

On Data-Augmentation and Consistency-Based Semi-Supervised Learning

arXiv.org Machine LearningJan-18-2021

Recently proposed consistency-based Semi-Supervised Learning (SSL) methods such as the $\Pi$-model, temporal ensembling, the mean teacher, or the virtual adversarial training, have advanced the state of the art in several SSL tasks. These methods can typically reach performances that are comparable to their fully supervised counterparts while using only a fraction of labelled examples. Despite these methodological advances, the understanding of these methods is still relatively limited. In this text, we analyse (variations of) the $\Pi$-model in settings where analytically tractable results can be obtained. We establish links with Manifold Tangent Classifiers and demonstrate that the quality of the perturbations is key to obtaining reasonable SSL performances. Importantly, we propose a simple extension of the Hidden Manifold Model that naturally incorporates data-augmentation schemes and offers a framework for understanding and experimenting with SSL methods.

arxiv preprint arxiv, conference paper, data-augmentation scheme, (12 more...)

2101.06967

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Badreddine, Samy, Garcez, Artur d'Avila, Serafini, Luciano, Spranger, Michael

Logic Tensor Networks

arXiv.org Artificial IntelligenceJan-16-2021

Artificial Intelligence agents are required to learn from their surroundings and to reason about the knowledge that has been learned in order to make decisions. While state-of-the-art learning from data typically uses sub-symbolic distributed representations, reasoning is normally useful at a higher level of abstraction with the use of a first-order logic language for knowledge representation. As a result, attempts at combining symbolic AI and neural computation into neural-symbolic systems have been on the increase. In this paper, we present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning through the introduction of a many-valued, end-to-end differentiable first-order logic called Real Logic as a representation language for deep learning. We show that LTN provides a uniform language for the specification and the computation of several AI tasks such as data clustering, multi-label classification, relational learning, query answering, semi-supervised learning, regression and embedding learning. We implement and illustrate each of the above tasks with a number of simple explanatory examples using TensorFlow 2. Keywords: Neurosymbolic AI, Deep Learning and Reasoning, Many-valued Logic.

formula, knowledge, neural network, (14 more...)

2012.13635

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Leibfried, Felix, Dutordoir, Vincent, John, ST, Durrande, Nicolas

A Tutorial on Sparse Gaussian Processes and Variational Inference

arXiv.org Machine LearningJan-15-2021

Gaussian processes (GPs) provide a framework for Bayesian inference that can offer principled uncertainty estimates for a large range of problems. For example, if we consider regression problems with Gaussian likelihoods, a GP model enjoys a posterior in closed form. However, identifying the posterior GP scales cubically with the number of training examples and requires to store all examples in memory. In order to overcome these obstacles, sparse GPs have been proposed that approximate the true posterior GP with pseudo-training examples. Importantly, the number of pseudo-training examples is user-defined and enables control over computational and memory complexity. In the general case, sparse GPs do not enjoy closed-form solutions and one has to resort to approximate inference. In this context, a convenient choice for approximate inference is variational inference (VI), where the problem of Bayesian inference is cast as an optimization problem -- namely, to maximize a lower bound of the log marginal likelihood. This paves the way for a powerful and versatile framework, where pseudo-training examples are treated as optimization arguments of the approximate posterior that are jointly identified together with hyperparameters of the generative model (i.e. prior and likelihood). The framework can naturally handle a wide scope of supervised learning problems, ranging from regression with heteroscedastic and non-Gaussian likelihoods to classification problems with discrete labels, but also multilabel problems. The purpose of this tutorial is to provide access to the basic matter for readers without prior knowledge in both GPs and VI. A proper exposition to the subject enables also access to more recent advances (like importance-weighted VI as well as inderdomain, multioutput and deep GPs) that can serve as an inspiration for new research ideas.

equation, formulation, likelihood, (15 more...)

2012.13962

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)
North America > United States (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education (1.00)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Patsantzis, Stassa, Muggleton, Stephen H.

Top Program Construction and Reduction for polynomial time Meta-Interpretive Learning

arXiv.org Artificial IntelligenceJan-13-2021

Meta-Interpretive Learners, like most ILP systems, learn by searching for a correct hypothesis in the hypothesis space, the powerset of all constructible clauses. We show how this exponentially-growing search can be replaced by the construction of a Top program: the set of clauses in all correct hypotheses that is itself a correct hypothesis. We give an algorithm for Top program construction and show that it constructs a correct Top program in polynomial time and from a finite number of examples. We implement our algorithm in Prolog as the basis of a new MIL system, Louise, that constructs a Top program and then reduces it by removing redundant clauses. We compare Louise to the state-of-the-art search-based MIL system Metagol in experiments on grid world navigation, graph connectedness and grammar learning datasets and find that Louise improves on Metagol's predictive accuracy when the hypothesis space and the target theory are both large, or when the hypothesis space does not include a correct hypothesis because of "classification noise" in the form of mislabelled examples. When the hypothesis space or the target theory are small, Louise and Metagol perform equally well.

correct hypothesis, hypothesis, metagol, (11 more...)

2101.0505

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Lederer, Armin, Umlauft, Jonas, Hirche, Sandra

Uniform Error and Posterior Variance Bounds for Gaussian Process Regression with Application to Safe Control

arXiv.org Machine LearningJan-13-2021

In application areas where data generation is expensive, Gaussian processes are a preferred supervised learning model due to their high data-efficiency. Particularly in model-based control, Gaussian processes allow the derivation of performance guarantees using probabilistic model error bounds. To make these approaches applicable in practice, two open challenges must be solved i) Existing error bounds rely on prior knowledge, which might not be available for many real-world tasks. (ii) The relationship between training data and the posterior variance, which mainly drives the error bound, is not well understood and prevents the asymptotic analysis. This article addresses these issues by presenting a novel uniform error bound using Lipschitz continuity and an analysis of the posterior variance function for a large class of kernels. Additionally, we show how these results can be used to guarantee safe control of an unknown dynamical system and provide numerical illustration examples.

gaussian process, kernel, posterior variance, (12 more...)

2101.05328

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)