Banff
Marginalized State Distribution Entropy Regularization in Policy Optimization
Islam, Riashat, Ahmed, Zafarali, Precup, Doina
Entropy regularization is used to get improved optimization performance in reinforcement learning tasks. A common form of regularization is to maximize policy entropy to avoid premature convergence and lead to more stochastic policies for exploration through action space. However, this does not ensure exploration in the state space. In this work, we instead consider the distribution of discounted weighting of states, and propose to maximize the entropy of a lower bound approximation to the weighting of a state, based on latent space state representation. We propose entropy regularization based on the marginal state distribution, to encourage the policy to have a more uniform distribution over the state space for exploration. Our approach based on marginal state distribution achieves superior state space coverage on complex gridworld domains, that translate into empirical gains in sparse reward 3D maze navigation and continuous control domains compared to entropy regularization with stochastic policies.
Binarized Canonical Polyadic Decomposition for Knowledge Graph Completion
Kishimoto, Koki, Hayashi, Katsuhiko, Akai, Genki, Shimbo, Masashi
Methods based on vector embeddings of knowledge graphs have been actively pursued as a promising approach to knowledge graph completion.However, embedding models generate storage-inefficient representations, particularly when the number of entities and relations, and the dimensionality of the real-valued embedding vectors are large. We present a binarized CANDECOMP/PARAFAC(CP) decomposition algorithm, which we refer to as B-CP, where real-valued parameters are replaced by binary values to reduce model size. Moreover, we show that a fast score computation technique can be developed with bitwise operations. We prove that B-CP is fully expressive by deriving a bound on the size of its embeddings. Experimental results on several benchmark datasets demonstrate that the proposed method successfully reduces model size by more than an order of magnitude while maintaining task performance at the same level as the real-valued CP model.
Less Is Better: Unweighted Data Subsampling via Influence Function
Wang, Zifeng, Zhu, Hong, Dong, Zhenhua, He, Xiuqiang, Huang, Shao-Lun
In the time of \emph{Big Data}, training complex models on large-scale data sets is challenging, making it appealing to reduce data volume for saving computation resources by subsampling. Most previous works in subsampling are weighted methods designed to help the performance of subset-model approach the full-set-model, hence the weighted methods have no chance to acquire a subset-model that is better than the full-set-model. However, we question that \emph{how can we achieve better model with less data?} In this work, we propose a novel Unweighted Influence Data Subsampling (UIDS) method, and prove that the subset-model acquired through our method can outperform the full-set-model. Besides, we show that overly confident on a given test set for sampling is common in Influence-based subsampling methods, which can eventually cause our subset-model's failure in out-of-sample test. To mitigate it, we develop a probabilistic sampling scheme to control the \emph{worst-case risk} over all distributions close to the empirical distribution. The experiment results demonstrate our methods superiority over existed subsampling methods in diverse tasks, such as text classification, image classification, click-through prediction, etc.
An Attribute Oriented Induction based Methodology for Data Driven Predictive Maintenance
Fernandez-Anakabe, Javier, Uriguen, Ekhi Zugasti, Ortega, Urko Zurutuza
Attribute Oriented Induction (AOI) is a data mining algorithm used for extracting knowledge of relational data, taking into account expert knowledge. It is a clustering algorithm that works by transforming the values of the attributes and converting an instance into others that are more generic or ambiguous. In this way, it seeks similarities between elements to generate data groupings. AOI was initially conceived as an algorithm for knowledge discovery in databases, but over the years it has been applied to other areas such as spatial patterns, intrusion detection or strategy making. In this paper, AOI has been extended to the field of Predictive Maintenance. The objective is to demonstrate that combining expert knowledge and data collected from the machine can provide good results in the Predictive Maintenance of industrial assets. To this end we adapted the algorithm and used an LSTM approach to perform both the Anomaly Detection (AD) and the Remaining Useful Life (RUL). The results obtained confirm the validity of the proposal, as the methodology was able to detect anomalies, and calculate the RUL until breakage with considerable degree of accuracy.
Proving Data-Poisoning Robustness in Decision Trees
Drews, Samuel, Albarghouthi, Aws, D'Antoni, Loris
Machine learning models are brittle, and small changes in the training data can result in different predictions. We study the problem of proving that a prediction is robust to data poisoning, where an attacker can inject a number of malicious elements into the training set to influence the learned model. We target decision-tree models, a popular and simple class of machine learning models that underlies many complex learning techniques. We present a sound verification technique based on abstract interpretation and implement it in a tool called Antidote. Antidote abstractly trains decision trees for an intractably large space of possible poisoned datasets. Due to the soundness of our abstraction, Antidote can produce proofs that, for a given input, the corresponding prediction would not have changed had the training set been tampered with or not. We demonstrate the effectiveness of Antidote on a number of popular datasets.
Neural Percussive Synthesis Parameterised by High-Level Timbral Features
Ramires, Antรณnio, Chandna, Pritish, Favory, Xavier, Gรณmez, Emilia, Serra, Xavier
We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds. This approach allows for intuitive control of a synthesizer, enabling the user to shape sounds without extensive knowledge of signal processing. We use a feedforward convolutional neural network-based architecture, which is able to map input parameters to the corresponding waveform. We propose two datasets to evaluate our approach on both a restrictive context, and in one covering a broader spectrum of sounds. The timbral features used as parameters are taken from recent literature in signal processing. We also use these features for evaluation and validation of the presented model, to ensure that changing the input parameters produces a congruent waveform with the desired characteristics. Finally, we evaluate the quality of the output sound using a subjective listening test. We provide sound examples and the system's source code for reproducibility.
DeepMimic: Mentor-Student Unlabeled Data Based Training
Mosafi, Itay, David, Eli, Netanyahu, Nathan S.
In this paper, we present a deep neural network (DNN) training approach called the "DeepMimic" training method. Enormous amounts of data are available nowadays for training usage. Yet, only a tiny portion of these data is manually labeled, whereas almost all of the data are unlabeled. The training approach presented utilizes, in a most simplified manner, the unlabeled data to the fullest, in order to achieve remarkable (classification) results. Our DeepMimic method uses a small portion of labeled data and a large amount of unlabeled data for the training process, as expected in a real-world scenario. It consists of a mentor model and a student model. Employing a mentor model trained on a small portion of the labeled data and then feeding it only with unlabeled data, we show how to obtain a (simplified) student model that reaches the same accuracy and loss as the mentor model, on the same test set, without using any of the original data labels in the training of the student model. Our experiments demonstrate that even on challenging classification tasks the student network architecture can be simplified significantly with a minor influence on the performance, i.e., we need not even know the original network architecture of the mentor. In addition, the time required for training the student model to reach the mentor's performance level is shorter, as a result of a simplified architecture and more available data. The proposed method highlights the disadvantages of regular supervised training and demonstrates the benefits of a less traditional training approach.
Online Robustness Training for Deep Reinforcement Learning
Fischer, Marc, Mirman, Matthew, Stalder, Steven, Vechev, Martin
In deep reinforcement learning (RL), adversarial attacks can trick an agent into unwanted states and disrupt training. We propose a system called Robust Student-DQN (RS-DQN), which permits online robustness training alongside Q networks, while preserving competitive performance. We show that RS-DQN can be combined with (i) state-of-the-art adversarial training and (ii) provably robust training to obtain an agent that is resilient to strong attacks during training and evaluation.
Adversarial Robustness of Flow-Based Generative Models
Pope, Phillip, Balaji, Yogesh, Feizi, Soheil
Flow-based generative models leverage invertible generator functions to fit a distribution to the training data using maximum likelihood. Despite their use in several application domains, robustness of these models to adversarial attacks has hardly been explored. In this paper, we study adversarial robustness of flow-based generative models both theoretically (for some simple models) and empirically (for more complex ones). First, we consider a linear flow-based generative model and compute optimal sample-specific and universal adversarial perturbations that maximally decrease the likelihood scores. Using this result, we study the robustness of the well-known adversarial training procedure, where we characterize the fundamental trade-off between model robustness and accuracy. Next, we empirically study the robustness of two prominent deep, non-linear, flow-based generative models, namely GLOW and RealNVP. We design two types of adversarial attacks; one that minimizes the likelihood scores of in-distribution samples, while the other that maximizes the likelihood scores of out-of-distribution ones. We find that GLOW and RealNVP are extremely sensitive to both types of attacks. Finally, using a hybrid adversarial training procedure, we significantly boost the robustness of these generative models.
Can You Really Backdoor Federated Learning?
Sun, Ziteng, Kairouz, Peter, Suresh, Ananda Theertha, McMahan, H. Brendan
The decentralized nature of federated learning makes detecting and defending against adversarial attacks a challenging task. This paper focuses on backdoor attacks in the federated learning setting, where the goal of the adversary is to reduce the performance of the model on targeted tasks while maintaining good performance on the main task. Unlike existing works, we allow non-malicious clients to have correctly labeled samples from the targeted tasks. We conduct a comprehensive study of backdoor attacks and defenses for the EMNIST dataset, a real-life, user-partitioned, and non-iid dataset. We observe that in the absence of defenses, the performance of the attack largely depends on the fraction of adversaries present and the "complexity'' of the targeted task. Moreover, we show that norm clipping and "weak'' differential privacy mitigate the attacks without hurting the overall performance. We have implemented the attacks and defenses in TensorFlow Federated (TFF), a TensorFlow framework for federated learning. In open-sourcing our code, our goal is to encourage researchers to contribute new attacks and defenses and evaluate them on standard federated datasets.