AITopics | Banff

Collaborating Authors

Banff

Sampling the "Inverse Set" of a Neuron: An Approach to Understanding Neural Nets

Hada, Suryabhan Singh, Carreira-Perpiñán, Miguel Á.

arXiv.org Machine Learning, Sep-26-2019

With the recent success of deep neural networks in computer vision, it is important to understand the internal working of these networks. What does a given neuron represent? The concepts captured by a neuron may be hard to understand or express in simple terms. The approach we propose in this paper is to characterize the region of input space that excites a given neuron to a certain level; we call this the inverse set. This inverse set is a complicated high-dimensional object that we explore by an optimization-based sampling approach. Inspection of samples of this set by a human can reveal regularities that help to understand the neuron. This goes beyond previous approaches, which were limited either to finding an image that maximally activates the neuron or to using Markov chain Monte Carlo to sample images; the latter is very slow, generates samples with little diversity, and lacks control over the activation value of the generated samples. Our approach also allows us to explore the intersection of inverse sets of several neurons and other variations.

feasible region, neural network, neuron, (14 more...)

1910.04857

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Say What I Want: Towards the Dark Side of Neural Dialogue Models

Liu, Haochen, Derr, Tyler, Liu, Zitao, Tang, Jiliang

arXiv.org Artificial Intelligence, Sep-23-2019

Neural dialogue models have been widely adopted in various chatbot applications because of their good performance in simulating and generalizing human conversations. However, these models have a dark side: because of the vulnerability of neural networks, a neural dialogue model can be manipulated by users into saying what they want, which raises concerns about the security of practical chatbot services. In this work, we investigate whether we can craft inputs that lead a well-trained black-box neural dialogue model to generate targeted outputs. We formulate this as a reinforcement learning (RL) problem and train a Reverse Dialogue Generator which efficiently finds such inputs for targeted outputs. Experiments conducted on a representative neural dialogue model show that our proposed model is able to discover such desired inputs in a considerable portion of cases. Overall, our work reveals this weakness of neural dialogue models and may prompt further research on developing corresponding solutions to avoid it.

dialogue model, neural dialogue model, similarity, (15 more...)

1909.06044

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
Europe > Austria > Vienna (0.14)
North America > United States > Michigan (0.04)
(17 more...)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (0.68)
Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Task Selection Policies for Multitask Learning

Glover, John, Hokamp, Chris

arXiv.org Machine Learning, Jul-14-2019

One of the questions that arises when designing models that learn to solve multiple tasks simultaneously is how much of the available training budget should be devoted to each individual task. We refer to any formalized approach to addressing this problem (learned or otherwise) as a task selection policy. In this work we provide an empirical evaluation of the performance of some common task selection policies in a synthetic bandit-style setting, as well as on the GLUE benchmark for natural language understanding. We connect task selection policy learning to existing work on automated curriculum learning and off-policy evaluation, and suggest a method based on counterfactual estimation that leads to improved model performance in our experimental settings.

arxiv, learning, task selection policy, (14 more...)

1907.06214

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.51)
(2 more...)

Generative Restricted Kernel Machines

Pandey, Arun, Schreurs, Joachim, Suykens, Johan A. K.

arXiv.org Machine Learning, Jun-19-2019

Generative modeling is a rapidly advancing area of machine learning research, with applications in multiple fields such as generated art, on-demand video, image denoising [1], exploration in reinforcement learning [2], collaborative filtering [3], inpainting [4] and many more. In general, three approaches have been used in generative modeling tasks. First, graphical models based on a probabilistic framework with latent variables, such as variational auto-encoders [5] and Restricted Boltzmann Machines (RBMs) [6, 7]. Second, more recently proposed models based on adversarial training, such as Generative Adversarial Networks (GANs) [8] and their many variants. Third, autoregressive models such as Pixel Recurrent Neural Networks (PixelRNNs) [9], which model the conditional distribution of every individual pixel given the previous pixels; generation involves sequentially predicting the pixels in an image along the two spatial dimensions.

artificial intelligence, feature map, machine learning, (18 more...)

1906.08144

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
North America > United States > Oregon (0.04)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Stability of Graph Scattering Transforms

Gama, Fernando, Bruna, Joan, Ribeiro, Alejandro

arXiv.org Machine Learning, Jun-11-2019

Scattering transforms are non-trainable deep convolutional architectures that exploit the multi-scale resolution of a wavelet filter bank to obtain an appropriate representation of data. More importantly, they are provably invariant to translations and stable to perturbations that are close to translations. This stability property endows the scattering transform with robustness to small changes in the metric domain of the data. For network data, regular convolutions do not apply, since the data domain has an irregular structure given by the network topology. In this work, we extend scattering transforms to network data by using multiresolution graph wavelets, which can be computed by means of graph convolutions. Furthermore, we prove that the resulting graph scattering transforms are stable to metric perturbations of the underlying network. This renders graph scattering transforms robust to changes in the network topology, making them particularly useful for transfer learning, topology estimation or time-varying graphs.

artificial intelligence, graph, machine learning, (18 more...)

1906.04784

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (0.40)

Industry:

Telecommunications > Networks (0.54)
Information Technology > Networks (0.54)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Continual learning with hypernetworks

von Oswald, Johannes, Henning, Christian, Sacramento, João, Grewe, Benjamin F.

arXiv.org Artificial Intelligence, Jun-3-2019

Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key observation: instead of relying on recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing previous weight realizations, which can be maintained in memory using a simple regularizer. Besides achieving good performance on standard CL benchmarks, additional experiments on long task sequences reveal that task-conditioned hypernetworks display an unprecedented capacity to retain previous memories. Notably, such long memory lifetimes are achieved in a compressive regime, when the number of trainable weights is comparable to or smaller than the target network size. We provide insight into the structure of low-dimensional task embedding spaces (the input space of the hypernetwork) and show that task-conditioned hypernetworks demonstrate transfer learning properties. Finally, forward information transfer is further supported by empirical results on a challenging CL benchmark based on the CIFAR-10/100 image datasets.
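The idea of a task-conditioned hypernetwork with a weight-rehearsal regularizer can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the hypernetwork is a single linear map, the target model is a single linear unit, and the function names (`make_hypernetwork`, `generate_weights`, `output_regularizer`) are hypothetical.

```python
import random

def make_hypernetwork(embed_dim, n_target_weights, seed=0):
    """Return a random linear hypernetwork as a matrix (list of rows).
    Assumption: a single linear layer stands in for the hypernetwork."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
            for _ in range(n_target_weights)]

def generate_weights(hyper, task_embedding):
    """Map a task embedding to the flat weight vector of the target model."""
    return [sum(h * e for h, e in zip(row, task_embedding)) for row in hyper]

def target_forward(weights, x):
    """Toy target model: one linear unit whose weights come from the hypernetwork."""
    return sum(w * xi for w, xi in zip(weights, x))

def output_regularizer(hyper, old_embeddings, stored_weights):
    """Squared drift between the weights generated now for earlier tasks
    and the weights that were stored when those tasks were learned."""
    total = 0.0
    for emb, old_w in zip(old_embeddings, stored_weights):
        new_w = generate_weights(hyper, emb)
        total += sum((a - b) ** 2 for a, b in zip(new_w, old_w))
    return total

# Usage: store the weights generated for task A, then measure drift later.
hyper = make_hypernetwork(embed_dim=2, n_target_weights=3)
task_a = [1.0, 0.0]
stored = [generate_weights(hyper, task_a)]
drift = output_regularizer(hyper, [task_a], stored)  # zero until hyper changes
```

In this sketch the penalty only needs the stored weight vectors of earlier tasks, not their training data, which is the point the abstract makes about rehearsing weight realizations.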

artificial intelligence, hypernetwork, machine learning, (17 more...)

1906.00695

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (0.68)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Explainability Techniques for Graph Convolutional Networks

Baldassarre, Federico, Azizpour, Hossein

arXiv.org Artificial Intelligence, May-31-2019

Graph Networks are used to make decisions in potentially complex scenarios but it is usually not obvious how or why they made them. In this work, we study the explainability of Graph Network decisions using two main classes of techniques, gradient-based and decomposition-based, on a toy dataset and a chemistry task. Our study sets the ground for future development as well as application to real-world problems.

data mining, explanation, machine learning, (17 more...)

1905.13686

Country:

Europe > France (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(4 more...)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Quantization-Based Regularization for Autoencoders

Wu, Hanwei, Gattami, Ather, Flierl, Markus

arXiv.org Machine Learning, May-27-2019

Autoencoders and their variations provide unsupervised models for learning low-dimensional representations for downstream tasks. Without proper regularization, autoencoder models are susceptible to overfitting and to the so-called posterior collapse phenomenon. In this paper, we introduce a quantization-based regularizer in the bottleneck stage of autoencoder models to learn meaningful latent representations. We combine the perspectives of Vector Quantized-Variational AutoEncoders (VQ-VAE) and classical denoising regularization schemes for neural networks. We interpret quantizers as regularizers that constrain latent representations while fostering a similarity mapping at the encoder. Before quantization, we impose noise on the latent variables and use a Bayesian estimator to optimize the quantizer-based representation. The introduced bottleneck Bayesian estimator outputs the posterior mean of the centroids to the decoder, and thus performs soft quantization of the latent variables. We show that our proposed regularization method results in improved latent representations for both supervised learning and clustering downstream tasks when compared to autoencoders using other bottleneck structures.
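A posterior-mean soft quantizer of the kind described can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's estimator: the noise is taken to be isotropic Gaussian with variance `noise_var`, the prior over centroids is taken to be uniform, and the function name `soft_quantize` is hypothetical.

```python
import math

def soft_quantize(z_noisy, centroids, noise_var):
    """Posterior mean of the centroids given a noisy latent vector.

    Assumptions (not from the source): isotropic Gaussian noise with
    variance noise_var and a uniform prior over the centroids, so the
    posterior weight of centroid c is proportional to
    exp(-||z - c||^2 / (2 * noise_var)).
    """
    logits = [
        -sum((zi - ci) ** 2 for zi, ci in zip(z_noisy, c)) / (2.0 * noise_var)
        for c in centroids
    ]
    # Subtract the maximum logit for numerical stability before exponentiating.
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted average of the centroids, one coordinate at a time.
    return [
        sum(w * c[d] for w, c in zip(weights, centroids))
        for d in range(len(z_noisy))
    ]

# Usage: a latent halfway between two centroids maps to their midpoint.
codebook = [[0.0], [1.0]]
midpoint = soft_quantize([0.5], codebook, noise_var=0.1)
```

Under these assumptions, a small `noise_var` makes the output approach the nearest centroid (hard quantization), while a large one pulls it toward the mean of all centroids.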

artificial intelligence, machine learning, representation, (18 more...)

1905.11062

Country:

Europe > France (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(3 more...)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Bivariate Beta LSTM

Song, Kyungwoo, Jang, JoonHo, Shin, Seung jae, Moon, Il-Chul

arXiv.org Machine Learning, May-25-2019

Long Short-Term Memory (LSTM) captures long-term dependencies through a cell state maintained by the input and forget gate structures, each of which models a gate output as a value in [0,1] through a sigmoid function. However, due to the gradual nature of the sigmoid function, the sigmoid gate is not flexible in representing multi-modality or skewness. In addition, previous models lack correlation modeling between the gates, which would offer a new way to incorporate domain knowledge. This paper proposes a new gate structure with the bivariate Beta distribution. The proposed gate structure enables hierarchical probabilistic modeling on the gates within the LSTM cell, so modelers can customize the cell state flow. We also observed that our structured flexible gate modeling is enabled by the probability density estimation. Moreover, we show theoretically and empirically that the bivariate Beta distribution gate structure alleviates the vanishing gradient problem. We demonstrate the effectiveness of the bivariate Beta gate structure on sentence classification, image classification, polyphonic music modeling, and image caption generation.

artificial intelligence, bblstm, machine learning, (17 more...)

1905.10521

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(16 more...)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment (0.87)
Media > Music (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Magnetoresistive RAM for error resilient XNOR-Nets

Tzoufras, Michail, Gajek, Marcin, Walker, Andrew

arXiv.org Machine Learning, May-24-2019

We trained three Binarized Convolutional Neural Network architectures (LeNet-4, Network-In-Network, AlexNet) on a variety of datasets (MNIST, CIFAR-10, CIFAR-100, extended SVHN, ImageNet) using error-prone activations and tested them without errors to study the resilience of the training process. With the exception of AlexNet trained on the ImageNet dataset, we found that Bit Error Rates (BERs) of a few percent during training do not degrade the test accuracy. Furthermore, by training AlexNet on progressively smaller subsets of ImageNet classes, we observed increasing tolerance to activation errors. The ability to operate with high BERs is critical for reducing power consumption in existing hardware and for facilitating emerging memory technologies. We discuss how operating at moderate BER can enable Magnetoresistive RAM with higher endurance, speed and density.
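Error-prone binarized activations at a given Bit Error Rate can be simulated with a short sketch. The error model here is an assumption made for illustration (each +1/-1 activation is flipped independently with probability `ber`); the abstract does not specify how errors were injected, and the names `binarize` and `inject_bit_errors` are hypothetical.

```python
import random

def binarize(values):
    """Map real-valued activations to +1 / -1 by sign (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]

def inject_bit_errors(binary_acts, ber, rng):
    """Flip each +1/-1 activation independently with probability ber.

    Assumption: independent, symmetric bit flips; a real memory device
    may show asymmetric or correlated errors.
    """
    return [-a if rng.random() < ber else a for a in binary_acts]

# Usage: corrupt activations at a 5% bit error rate during training only;
# at test time the clean binarized activations would be used instead.
rng = random.Random(0)
acts = binarize([0.3, -1.2, 0.0, 2.5])
noisy = inject_bit_errors(acts, ber=0.05, rng=rng)
```

Passing the random generator explicitly keeps the corruption reproducible across runs, which helps when comparing test accuracy at different error rates.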

artificial intelligence, deep learning, machine learning, (20 more...)

1905.10927

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Fremont (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.37)
