AITopics | Banff

Marginalized State Distribution Entropy Regularization in Policy Optimization

Islam, Riashat, Ahmed, Zafarali, Precup, Doina

arXiv.org Machine Learning, Dec-11-2019

Entropy regularization is used to improve optimization performance in reinforcement learning tasks. A common form of regularization is to maximize policy entropy, which avoids premature convergence and leads to more stochastic policies for exploration in the action space. However, this does not ensure exploration in the state space. In this work, we instead consider the distribution of discounted weighting of states, and propose to maximize the entropy of a lower bound approximation to the weighting of a state, based on latent space state representation. We propose entropy regularization based on the marginal state distribution, to encourage the policy to have a more uniform distribution over the state space for exploration. Our approach based on marginal state distribution achieves superior state space coverage on complex gridworld domains, which translates into empirical gains in sparse reward 3D maze navigation and continuous control domains compared to entropy regularization with stochastic policies.
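
As a rough illustration of what the entropy of a discounted state weighting means in the simplest tabular case, the sketch below computes normalized discounted visitation weights for the states along one trajectory and takes their Shannon entropy. It is not the paper's method, which maximizes a lower bound approximation over a latent state representation. The function name `discounted_state_entropy` and the finite-trajectory normalization are assumptions made for illustration.

```python
import math
from collections import defaultdict


def discounted_state_entropy(trajectory, gamma=0.99):
    """Shannon entropy (in nats) of the normalized discounted state weighting.

    trajectory: sequence of hashable state identifiers s_0, s_1, ...
    gamma: discount factor in (0, 1].
    Hypothetical helper for illustration; tabular states only.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one state")
    # Accumulate gamma^t for every visit to a state at time step t.
    weights = defaultdict(float)
    for t, state in enumerate(trajectory):
        weights[state] += gamma ** t
    # Normalize the weights into a probability distribution over states.
    total = sum(weights.values())
    entropy = 0.0
    for w in weights.values():
        p = w / total
        entropy -= p * math.log(p)
    return entropy
```

Under this toy definition, a trajectory that spreads its visits across many states scores higher than one that lingers in a single state, which is the behaviour a state-entropy regularizer would reward.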

entropy, regularization, state distribution, (15 more...)

arXiv: 1912.05128

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
North America > Canada > Quebec > Montreal (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Binarized Canonical Polyadic Decomposition for Knowledge Graph Completion

Kishimoto, Koki, Hayashi, Katsuhiko, Akai, Genki, Shimbo, Masashi

arXiv.org Machine Learning, Dec-4-2019

Methods based on vector embeddings of knowledge graphs have been actively pursued as a promising approach to knowledge graph completion. However, embedding models generate storage-inefficient representations, particularly when the number of entities and relations and the dimensionality of the real-valued embedding vectors are large. We present a binarized CANDECOMP/PARAFAC (CP) decomposition algorithm, which we refer to as B-CP, where real-valued parameters are replaced by binary values to reduce model size. Moreover, we show that a fast score computation technique can be developed with bitwise operations. We prove that B-CP is fully expressive by deriving a bound on the size of its embeddings. Experimental results on several benchmark datasets demonstrate that the proposed method successfully reduces model size by more than an order of magnitude while maintaining task performance at the same level as the real-valued CP model.
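
To make the idea of scoring with bitwise operations concrete, here is a minimal sketch that assumes the binary values are +1 and -1 and that a CP triple score is the sum over dimensions of the element-wise product of three embedding vectors. The abstract does not spell out the encoding, so the bit layout and the helper names (`pack_signs`, `cp_score_real`, `cp_score_bitwise`) are illustrative assumptions rather than the authors' implementation.

```python
def pack_signs(signs):
    """Pack a list of +1/-1 values into an int bitmask (bit d set means +1)."""
    mask = 0
    for d, s in enumerate(signs):
        if s not in (1, -1):
            raise ValueError("entries must be +1 or -1")
        if s == 1:
            mask |= 1 << d
    return mask


def cp_score_real(a, b, c):
    """Reference CP triple score: sum over dimensions of a_d * b_d * c_d."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


def cp_score_bitwise(a_bits, b_bits, c_bits, dim):
    """Same score computed from packed bitmasks with XOR and a popcount.

    A product of three signs is +1 exactly when an odd number of them
    are +1, which is the XOR of the three bits. The score is therefore
    (number of +1 products) minus (number of -1 products).
    """
    positives = bin(a_bits ^ b_bits ^ c_bits).count("1")
    return 2 * positives - dim
```

With this encoding each dimension needs one bit instead of a floating-point number, and the per-dimension multiplications collapse into two XORs and one population count per machine word.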

ijk, knowledge graph, proceedings, (15 more...)

arXiv: 1912.02686

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Netherlands (0.04)
(36 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Less Is Better: Unweighted Data Subsampling via Influence Function

Wang, Zifeng, Zhu, Hong, Dong, Zhenhua, He, Xiuqiang, Huang, Shao-Lun

arXiv.org Machine Learning, Dec-3-2019

In the era of Big Data, training complex models on large-scale data sets is challenging, making it appealing to reduce data volume by subsampling to save computation resources. Most previous works in subsampling are weighted methods designed to help the performance of the subset-model approach that of the full-set-model; hence the weighted methods have no chance of acquiring a subset-model that is better than the full-set-model. However, we ask: how can we achieve a better model with less data? In this work, we propose a novel Unweighted Influence Data Subsampling (UIDS) method, and prove that the subset-model acquired through our method can outperform the full-set-model. Besides, we show that overconfidence in a given test set during sampling is common in influence-based subsampling methods, which can eventually cause our subset-model to fail in out-of-sample tests. To mitigate this, we develop a probabilistic sampling scheme to control the worst-case risk over all distributions close to the empirical distribution. The experimental results demonstrate our method's superiority over existing subsampling methods in diverse tasks, such as text classification, image classification, and click-through prediction.

confidence degree, gradient, probability, (15 more...)

arXiv: 1912.01321

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(4 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

An Attribute Oriented Induction based Methodology for Data Driven Predictive Maintenance

Fernandez-Anakabe, Javier, Uriguen, Ekhi Zugasti, Ortega, Urko Zurutuza

arXiv.org Machine Learning, Dec-2-2019

Attribute Oriented Induction (AOI) is a data mining algorithm used for extracting knowledge from relational data, taking into account expert knowledge. It is a clustering algorithm that works by transforming the values of the attributes and converting an instance into others that are more generic or ambiguous. In this way, it seeks similarities between elements to generate data groupings. AOI was initially conceived as an algorithm for knowledge discovery in databases, but over the years it has been applied to other areas such as spatial patterns, intrusion detection, or strategy making. In this paper, AOI has been extended to the field of Predictive Maintenance. The objective is to demonstrate that combining expert knowledge and data collected from the machine can provide good results in the Predictive Maintenance of industrial assets. To this end, we adapted the algorithm and used an LSTM approach to perform both Anomaly Detection (AD) and Remaining Useful Life (RUL) estimation. The results obtained confirm the validity of the proposal, as the methodology was able to detect anomalies and calculate the RUL until breakage with a considerable degree of accuracy.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv: 1912.00662

Country:

North America > United States > Minnesota (0.04)
North America > United States > Hawaii (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (0.88)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Proving Data-Poisoning Robustness in Decision Trees

Drews, Samuel, Albarghouthi, Aws, D'Antoni, Loris

arXiv.org Artificial Intelligence, Dec-2-2019

Machine learning models are brittle, and small changes in the training data can result in different predictions. We study the problem of proving that a prediction is robust to data poisoning, where an attacker can inject a number of malicious elements into the training set to influence the learned model. We target decision-tree models, a popular and simple class of machine learning models that underlies many complex learning techniques. We present a sound verification technique based on abstract interpretation and implement it in a tool called Antidote. Antidote abstractly trains decision trees for an intractably large space of possible poisoned datasets. Due to the soundness of our abstraction, Antidote can produce proofs that, for a given input, the corresponding prediction would not have changed had the training set been tampered with. We demonstrate the effectiveness of Antidote on a number of popular datasets.

dataset, predicate, robustness, (17 more...)

arXiv: 1912.00981

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Neural Percussive Synthesis Parameterised by High-Level Timbral Features

Ramires, António, Chandna, Pritish, Favory, Xavier, Gómez, Emilia, Serra, Xavier

arXiv.org Machine Learning, Nov-25-2019

We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds. This approach allows for intuitive control of a synthesizer, enabling the user to shape sounds without extensive knowledge of signal processing. We use a feedforward convolutional neural network-based architecture, which is able to map input parameters to the corresponding waveform. We propose two datasets to evaluate our approach, one in a restrictive context and one covering a broader spectrum of sounds. The timbral features used as parameters are taken from recent literature in signal processing. We also use these features for evaluation and validation of the presented model, to ensure that changing the input parameters produces a congruent waveform with the desired characteristics. Finally, we evaluate the quality of the output sound using a subjective listening test. We provide sound examples and the system's source code for reproducibility.

architecture, dataset, waveform, (15 more...)

arXiv: 1911.11853

Country:

South America > Brazil > Paraná > Curitiba (0.04)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(5 more...)

Genre: Research Report (0.40)

Industry:

Media > Music (0.95)
Leisure & Entertainment (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

DeepMimic: Mentor-Student Unlabeled Data Based Training

Mosafi, Itay, David, Eli, Netanyahu, Nathan S.

arXiv.org Machine Learning, Nov-23-2019

In this paper, we present a deep neural network (DNN) training approach called the "DeepMimic" training method. Enormous amounts of data are available nowadays for training. Yet, only a tiny portion of these data is manually labeled, whereas almost all of the data are unlabeled. The training approach presented uses the unlabeled data to the fullest, in a greatly simplified manner, in order to achieve remarkable (classification) results. Our DeepMimic method uses a small portion of labeled data and a large amount of unlabeled data for the training process, as expected in a real-world scenario. It consists of a mentor model and a student model. Employing a mentor model trained on a small portion of the labeled data and then feeding it only with unlabeled data, we show how to obtain a (simplified) student model that reaches the same accuracy and loss as the mentor model, on the same test set, without using any of the original data labels in the training of the student model. Our experiments demonstrate that even on challenging classification tasks the student network architecture can be simplified significantly with a minor influence on the performance, i.e., we need not even know the original network architecture of the mentor. In addition, the time required for training the student model to reach the mentor's performance level is shorter, as a result of a simplified architecture and more available data. The proposed method highlights the disadvantages of regular supervised training and demonstrates the benefits of a less traditional training approach.

accuracy, dataset, student model, (12 more...)

doi: 10.1007/978-3-030-30493-5_44

arXiv: 1912.00079

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > Puerto Rico > San Juan > San Juan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(18 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Online Robustness Training for Deep Reinforcement Learning

Fischer, Marc, Mirman, Matthew, Stalder, Steven, Vechev, Martin

arXiv.org Machine Learning, Nov-22-2019

In deep reinforcement learning (RL), adversarial attacks can trick an agent into unwanted states and disrupt training. We propose a system called Robust Student-DQN (RS-DQN), which permits online robustness training alongside Q networks, while preserving competitive performance. We show that RS-DQN can be combined with (i) state-of-the-art adversarial training and (ii) provably robust training to obtain an agent that is resilient to strong attacks during training and evaluation.

agent, estpgd, international conference, (12 more...)

arXiv: 1911.00887

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > British Columbia > Vancouver (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
(11 more...)

Genre:

Research Report (0.50)
Instructional Material (0.34)

Industry:

Leisure & Entertainment > Games (0.47)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Adversarial Robustness of Flow-Based Generative Models

Pope, Phillip, Balaji, Yogesh, Feizi, Soheil

arXiv.org Machine Learning, Nov-19-2019

Flow-based generative models leverage invertible generator functions to fit a distribution to the training data using maximum likelihood. Despite their use in several application domains, the robustness of these models to adversarial attacks has hardly been explored. In this paper, we study the adversarial robustness of flow-based generative models both theoretically (for some simple models) and empirically (for more complex ones). First, we consider a linear flow-based generative model and compute optimal sample-specific and universal adversarial perturbations that maximally decrease the likelihood scores. Using this result, we study the robustness of the well-known adversarial training procedure, where we characterize the fundamental trade-off between model robustness and accuracy. Next, we empirically study the robustness of two prominent deep, non-linear, flow-based generative models, namely GLOW and RealNVP. We design two types of adversarial attacks: one that minimizes the likelihood scores of in-distribution samples, and another that maximizes the likelihood scores of out-of-distribution ones. We find that GLOW and RealNVP are extremely sensitive to both types of attacks. Finally, using a hybrid adversarial training procedure, we significantly boost the robustness of these generative models.

adversarial training, generative model, robustness, (15 more...)

arXiv: 1911.08654

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(5 more...)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (0.57)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
(2 more...)

Can You Really Backdoor Federated Learning?

Sun, Ziteng, Kairouz, Peter, Suresh, Ananda Theertha, McMahan, H. Brendan

arXiv.org Machine Learning, Nov-18-2019

The decentralized nature of federated learning makes detecting and defending against adversarial attacks a challenging task. This paper focuses on backdoor attacks in the federated learning setting, where the goal of the adversary is to reduce the performance of the model on targeted tasks while maintaining good performance on the main task. Unlike existing works, we allow non-malicious clients to have correctly labeled samples from the targeted tasks. We conduct a comprehensive study of backdoor attacks and defenses for the EMNIST dataset, a real-life, user-partitioned, and non-iid dataset. We observe that in the absence of defenses, the performance of the attack largely depends on the fraction of adversaries present and the "complexity" of the targeted task. Moreover, we show that norm clipping and "weak" differential privacy mitigate the attacks without hurting the overall performance. We have implemented the attacks and defenses in TensorFlow Federated (TFF), a TensorFlow framework for federated learning. In open-sourcing our code, our goal is to encourage researchers to contribute new attacks and defenses and evaluate them on standard federated datasets.
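
The norm clipping defense mentioned above can be sketched in a few lines of plain Python. The idea, as commonly described, is that the server rescales any client update whose L2 norm exceeds a bound before averaging, which limits how far a single malicious client can pull the model. This sketch does not use TensorFlow Federated and does not add the noise needed for differential privacy. The function names `clip_update` and `average_clipped`, and the flat-list representation of an update, are assumptions for illustration.

```python
import math


def clip_update(update, max_norm):
    """Scale a client update so its L2 norm does not exceed max_norm."""
    if max_norm <= 0:
        raise ValueError("max_norm must be positive")
    norm = math.sqrt(sum(x * x for x in update))
    # Updates already inside the bound (or all-zero) are left unchanged.
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def average_clipped(updates, max_norm):
    """Unweighted mean of client updates after clipping each one."""
    clipped = [clip_update(u, max_norm) for u in updates]
    n = len(clipped)
    return [sum(col) / n for col in zip(*clipped)]
```

For example, `clip_update([3.0, 4.0], 1.0)` rescales a norm-5 update down to norm 1, while an update already inside the bound passes through untouched.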

adversary, backdoor attack, backdoor task, (15 more...)

arXiv: 1911.07963

Country:

North America > United States > California > Los Angeles County > Long Beach (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Virginia (0.04)
(5 more...)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
