Le, Trung
Improved and Efficient Text Adversarial Attacks using Target Information
Hossam, Mahmoud, Le, Trung, Zhao, He, Huynh, Viet, Phung, Dinh
There has recently been growing interest in studying adversarial examples for natural language models in the black-box setting. These methods attack natural language classifiers by perturbing certain important words until the classifier label changes. To find these important words, existing methods rank all words by importance, querying the target model word by word for each input sentence, which makes them highly query-inefficient. An interesting new approach addresses this problem through interpretable learning, learning the word ranking instead of performing the previous expensive search. The main advantage of this approach is that it achieves attack rates comparable to state-of-the-art methods, yet faster and with fewer queries, where fewer queries are desirable to avoid raising suspicion towards the attacking agent. Nonetheless, this approach sacrifices useful information that could be leveraged from the target classifier for the sake of query efficiency. In this paper, we study the effect of leveraging the target model's outputs and data on both attack rates and the average number of queries, and we show that both can be improved with a limited overhead of additional queries.
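To make the per-word query cost concrete, here is a minimal sketch (our own illustration, not the paper's code) of deletion-based word-importance ranking against a black-box classifier; `classifier` and the `[UNK]` mask token are hypothetical stand-ins:

```python
# Score each word by masking it and measuring the drop in the target
# class probability; one model query per word, as described above.
def rank_words_by_importance(classifier, words, target_label):
    """Return word indices sorted by importance, most important first."""
    base_prob = classifier(" ".join(words))[target_label]
    scores = []
    for i in range(len(words)):
        masked = words[:i] + ["[UNK]"] + words[i + 1:]   # one query per word
        prob = classifier(" ".join(masked))[target_label]
        scores.append(base_prob - prob)                  # larger drop = more important
    return sorted(range(len(words)), key=lambda i: scores[i], reverse=True)
```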
Text Generation with Deep Variational GAN
Hossam, Mahmoud, Le, Trung, Papasimeon, Michael, Huynh, Viet, Phung, Dinh
Generating realistic sequences is a central task in many machine learning applications. There has been considerable recent progress in building deep generative models for sequence generation tasks. However, mode collapse remains a main issue for current models. In this paper, we propose a generic GAN-based framework that addresses the problem of mode collapse in a principled approach. We change the standard GAN objective to maximize a variational lower bound of the log-likelihood while minimizing the Jensen-Shannon divergence between the data and model distributions. We evaluate our model on the text generation task and show that it can generate realistic text with high diversity.
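As a rough formalisation of the combined objective (our notation, with a hypothetical trade-off weight \(\lambda\); the paper's exact formulation may differ):

```latex
% Hypothetical notation: G generator, q(z|x) inference network, p(z) prior.
% Variational lower bound on the log-likelihood plus a JS adversarial term.
\max_{G,\,q}\;
  \mathbb{E}_{q(z \mid x)}\!\left[\log p_G(x \mid z)\right]
  - \mathrm{KL}\!\left(q(z \mid x)\,\|\,p(z)\right)
  - \lambda\, \mathrm{JS}\!\left(p_{\mathrm{data}}\,\|\,p_G\right)
```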
BoMb-OT: On Batch of Mini-batches Optimal Transport
Nguyen, Khai, Nguyen, Quoc, Ho, Nhat, Pham, Tung, Bui, Hung, Phung, Dinh, Le, Trung
Mini-batch optimal transport (m-OT) has been successfully used in practical applications that involve probability measures with intractable density, or probability measures with a very large number of supports. The m-OT solves several sparser optimal transport problems and then returns the average of their costs and transportation plans. Despite its scalability advantage, m-OT is not a proper metric between probability measures since it does not satisfy the identity property. To address this problem, we propose a novel mini-batching scheme for optimal transport, named Batch of Mini-batches Optimal Transport (BoMb-OT), that can be formulated as a well-defined distance on the space of probability measures. Furthermore, we show that m-OT is a limit of the entropic regularized version of the proposed BoMb-OT when the regularization parameter goes to infinity. We carry out extensive experiments to show that the new mini-batching scheme can estimate a better transportation plan between two original measures than m-OT. This leads to favorable performance of BoMb-OT in matching and color transfer tasks. Furthermore, we observe that BoMb-OT also provides a better objective loss than m-OT for doing approximate Bayesian computation, estimating parameters of interest in parametric generative models, and learning non-parametric generative models with gradient flow.
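For illustration, here is a minimal sketch of the plain m-OT baseline that BoMb-OT improves on, using the POT library; function and variable names are our own:

```python
# m-OT: solve several small exact OT problems on random mini-batch pairs
# and return the average of their costs.
import numpy as np
import ot  # Python Optimal Transport (pip install pot)

def minibatch_ot(X, Y, batch_size=64, k=8, seed=0):
    rng = np.random.default_rng(seed)
    costs = []
    for _ in range(k):
        xb = X[rng.choice(len(X), batch_size, replace=False)]
        yb = Y[rng.choice(len(Y), batch_size, replace=False)]
        M = ot.dist(xb, yb)                      # pairwise squared Euclidean costs
        a = b = np.full(batch_size, 1.0 / batch_size)
        costs.append(ot.emd2(a, b, M))           # exact OT cost on the mini-batch
    return np.mean(costs)                        # m-OT: average of mini-batch costs
```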
Understanding and Achieving Efficient Robustness with Adversarial Contrastive Learning
Bui, Anh, Le, Trung, Zhao, He, Montague, Paul, Camtepe, Seyit, Phung, Dinh
Contrastive learning (CL) has recently emerged as an effective approach to learning representation in a range of downstream tasks. Central to this approach is the selection of positive (similar) and negative (dissimilar) sets to provide the model the opportunity to 'contrast' between data and class representation in the latent space. In this paper, we investigate CL for improving model robustness using adversarial samples. We first design and perform a comprehensive study to understand how adversarial vulnerability behaves in the latent space. Based on this empirical evidence, we propose an effective and efficient supervised contrastive learning approach to achieve model robustness against adversarial attacks. Among the existing defenses, adversarial training methods (e.g., FGSM and PGD adversarial training [13, 22] and TRADES [36]) that utilize adversarial examples as training data have been one of the most effective approaches, truly boosting model robustness without facing the problem of obfuscated gradients [3]. In adversarial training, recent works [34, 4] show that reducing the divergence between the representations of images and their adversarial examples in latent space (e.g., the feature space output from an intermediate layer of a classifier) can significantly improve robustness. For example, in [4], latent representations of images in the same class are pulled closer together than those in different classes, which leads to a more compact latent space and, consequently, better robustness.
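A simplified PyTorch sketch of the supervised contrastive idea described above, where an image's adversarial example is treated as an extra positive; this is our own illustrative loss, not the paper's exact formulation:

```python
# Supervised contrastive loss over concatenated clean + adversarial features:
# same-class pairs (including clean/adversarial pairs) are pulled together.
import torch
import torch.nn.functional as F

def sup_con_loss(feats, labels, temperature=0.1):
    """feats: (2N, d) clean + adversarial features; labels: (2N,) class ids."""
    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.t() / temperature                     # pairwise similarities
    pos_mask = (labels[:, None] == labels[None, :]).float()
    pos_mask.fill_diagonal_(0)                                # a sample is not its own positive
    logits = sim - sim.max(dim=1, keepdim=True).values.detach()
    self_mask = 1 - torch.eye(len(feats), device=feats.device)
    log_prob = logits - torch.log((torch.exp(logits) * self_mask).sum(1, keepdim=True))
    return -(pos_mask * log_prob).sum(1).div(pos_mask.sum(1).clamp(min=1)).mean()
```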
Neural Topic Model via Optimal Transport
Zhao, He, Phung, Dinh, Huynh, Viet, Le, Trung, Buntine, Wray
Recently, Neural Topic Models (NTMs) inspired by variational autoencoders have attracted increasing research interest due to their promising results on text analysis. However, it is usually hard for existing NTMs to achieve good document representation and coherent/diverse topics at the same time. Moreover, they often degrade their performance severely on short documents. The requirement of reparameterisation could also compromise their training quality and model flexibility. To address these shortcomings, we present a new neural topic model via the theory of optimal transport (OT). Specifically, we propose to learn the topic distribution of a document by directly minimising its OT distance to the document's word distributions. Importantly, the cost matrix of the OT distance models the weights between topics and words, which is constructed by the distances between topics and words in an embedding space. Our proposed model can be trained efficiently with a differentiable loss. Extensive experiments show that our framework significantly outperforms the state-of-the-art NTMs on discovering more coherent and diverse topics and deriving better document representations for both regular and short texts.
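To illustrate the training signal, here is a plain entropic-OT (Sinkhorn) sketch in PyTorch; this is our simplification, and the paper's exact loss and cost construction may differ:

```python
# Differentiable entropic OT cost between a document's topic distribution a
# and its word distribution b, with cost matrix M built from embeddings,
# e.g. M[k, w] = 1 - cosine(topic_emb[k], word_emb[w])  (hypothetical tensors).
import torch

def sinkhorn_cost(a, b, M, reg=0.1, n_iters=50):
    """a: (K,) topic histogram; b: (V,) word histogram; M: (K, V) cost matrix."""
    K = torch.exp(-M / reg)
    u = torch.ones_like(a)
    for _ in range(n_iters):          # Sinkhorn fixed-point iterations
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]
    return (plan * M).sum()           # transport cost, differentiable in a and M
```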
OptiGAN: Generative Adversarial Networks for Goal Optimized Sequence Generation
Hossam, Mahmoud, Le, Trung, Huynh, Viet, Papasimeon, Michael, Phung, Dinh
One of the challenging problems in sequence generation tasks is the optimized generation of sequences with specific desired goals. Current sequential generative models mainly generate sequences to closely mimic the training data, without direct optimization of desired goals or properties specific to the task. We introduce OptiGAN, a generative model that incorporates both Generative Adversarial Networks (GAN) and Reinforcement Learning (RL) to optimize desired goal scores using policy gradients. We apply our model to text and real-valued sequence generation, where it achieves higher desired scores, outperforming GAN and RL baselines, while not sacrificing output sample diversity.
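A minimal sketch of the policy-gradient ingredient in PyTorch; this is our own simplification, and `alpha` and the reward mixing are hypothetical:

```python
# REINFORCE-style surrogate loss: the generator is treated as a policy and
# its sequence log-likelihood is weighted by a reward that mixes the
# discriminator score with a task-specific goal score.
import torch

def policy_gradient_loss(log_probs, d_scores, goal_scores, alpha=0.5):
    """log_probs: (B, T) token log-probs of sampled sequences; scores: (B,)."""
    reward = alpha * d_scores + (1 - alpha) * goal_scores    # mixed reward per sequence
    baseline = reward.mean()                                 # simple variance reduction
    return -((reward - baseline).detach()[:, None] * log_probs).sum(1).mean()
```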
Improving Adversarial Robustness by Enforcing Local and Global Compactness
Bui, Anh, Le, Trung, Zhao, He, Montague, Paul, De Vel, Olivier, Abraham, Tamas, Phung, Dinh
The fact that deep neural networks are susceptible to crafted perturbations severely impacts the use of deep learning in certain domains of application. Among many developed defense models against such attacks, adversarial training emerges as the most successful method that consistently resists a wide range of attacks. In this work, based on an observation from a previous study that the representations of a clean data example and its adversarial examples become more divergent in higher layers of a deep neural net, we propose the Adversary Divergence Reduction Network which enforces local/global compactness and the clustering assumption over an intermediate layer of a deep neural network. We conduct comprehensive experiments to understand the isolating behavior of each component (i.e., local/global compactness and the clustering assumption) and compare our proposed model with state-of-the-art adversarial training methods. The experimental results demonstrate that augmenting adversarial training with our proposed components can further improve the robustness of the network, leading to higher unperturbed and adversarial predictive performances.
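In our own notation, a hedged sketch of what local and global compactness regularizers over an intermediate layer could look like; these are illustrative, not the paper's exact losses:

```python
# Local compactness: pull each clean latent toward its adversarial latent.
# Global compactness: pull same-class latents toward their class mean.
import torch

def local_compactness(z_clean, z_adv):
    return (z_clean - z_adv).pow(2).sum(1).mean()

def global_compactness(z, labels):
    loss, classes = 0.0, labels.unique()
    for c in classes:
        zc = z[labels == c]
        loss = loss + (zc - zc.mean(0)).pow(2).sum(1).mean()  # distance to class mean
    return loss / len(classes)
```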
Perturbations are not Enough: Generating Adversarial Examples with Spatial Distortions
Zhao, He, Le, Trung, Montague, Paul, De Vel, Olivier, Abraham, Tamas, Phung, Dinh
Deep neural network image classifiers are reported to be susceptible to adversarial evasion attacks, which use carefully crafted images created to mislead a classifier. Recently, various kinds of adversarial attack methods have been proposed, most of which focus on adding small perturbations to input images. Despite the success of existing approaches, generating realistic adversarial images with small perturbations remains a challenging problem. In this paper, we aim to address this problem by proposing a novel adversarial method, which generates adversarial examples by imposing not only perturbations but also spatial distortions on input images, including scaling, rotation, shear, and translation. As humans are less susceptible to small spatial distortions, the proposed approach can produce visually more realistic attacks with smaller perturbations, able to deceive classifiers without affecting human predictions. We train our method with amortized techniques using neural networks and generate adversarial examples efficiently by a forward pass of the networks. Extensive experiments on attacking different types of non-robustified classifiers and robust classifiers with defence show that our method achieves state-of-the-art performance in comparison with advanced attack methods.
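For illustration, a minimal PyTorch sketch of applying a differentiable affine distortion (rotation, scale, shear, translation) that such an attack could optimize alongside pixel perturbations; names here are our own:

```python
# Differentiable spatial distortion via PyTorch's affine-grid sampler; the
# 2x3 affine matrices `theta` can be optimized by gradient descent.
import torch
import torch.nn.functional as F

def spatial_distort(images, theta):
    """images: (B, C, H, W); theta: (B, 2, 3) affine matrices."""
    grid = F.affine_grid(theta, images.shape, align_corners=False)
    return F.grid_sample(images, grid, align_corners=False)

# Example: a small rotation by angle phi for every image in a batch of size B:
# theta = torch.tensor([[[cos, -sin, 0.], [sin, cos, 0.]]]).repeat(B, 1, 1)
```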
When Can Neural Networks Learn Connected Decision Regions?
Le, Trung, Phung, Dinh
Previous work has questioned the conditions under which the decision regions of a neural network are connected and further showed the implications of the corresponding theory for the problem of adversarial manipulation of classifiers. It has been proven that, for a class of activation functions including leaky ReLU, neural networks with a pyramidal structure, that is, no layer has more hidden units than the input dimension, necessarily produce connected decision regions. In this paper, we advance this important result by further developing the sufficient and necessary conditions under which the decision regions of a neural network are connected. We then apply our framework to overcome the limits of existing work and further study the capacity of neural networks to learn connected regions for a much wider class of activation functions, including those widely used, namely ReLU, sigmoid, tanh, softplus, and the exponential linear function.
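A hedged restatement of the prior connectivity result in our own notation:

```latex
% Our notation, not necessarily the paper's: f is a network with input
% dimension d and layer widths n_1, ..., n_L satisfying n_\ell \le d
% (pyramidal structure), with activations such as leaky ReLU. Then the
% decision region of every class c,
R_c \;=\; \bigl\{\, x \in \mathbb{R}^d : \arg\max_{k} f_k(x) = c \,\bigr\},
% is a connected subset of \mathbb{R}^d.
```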
Theoretical Perspective of Deep Domain Adaptation
Le, Trung, Nguyen, Khanh, Phung, Dinh
Deep domain adaptation has recently achieved great success. Compared with shallow domain adaptation, deep domain adaptation has shown higher predictive performance and a stronger capacity to tackle structured data (e.g., image and sequential data). The underlying idea of deep domain adaptation is to bridge the gap between the source and target domains in a joint feature space so that a supervised classifier trained on labeled source data can be nicely transferred to the target domain. This idea is certainly appealing and motivating, but from a theoretical perspective, no theory has been developed to support it. In this paper, we develop a rigorous theory to explain why we can bridge the relevant gap in an intermediate joint space. In light of our proposed theory, it turns out that there is a strong connection between deep domain adaptation and the Wasserstein (WS) distance. More specifically, our theory revolves around the following points: i) first, we propose a context wherein we can perform transfer learning perfectly, and ii) second, we further prove that by bridging the relevant gap and minimizing some reconstruction errors, we are minimizing a WS distance between the push-forward source distribution and the target distribution via a transport that maps from the source to the target domain.
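For reference, the Wasserstein distance invoked above, in standard notation:

```latex
% \Pi(\mu, \nu) denotes the set of couplings (joint distributions) whose
% marginals are \mu and \nu, and c(x, y) is the ground transport cost.
W_c(\mu, \nu) \;=\; \inf_{\pi \in \Pi(\mu,\,\nu)} \int c(x, y)\, \mathrm{d}\pi(x, y)
```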