Reconstruction of Incomplete Wildfire Data using Deep Generative Models
We present our submission to the Extreme Value Analysis 2021 Data Challenge in which teams were asked to accurately predict distributions of wildfire frequency and size within spatio-temporal regions of missing data. For the purpose of this competition we developed a variant of the powerful variational autoencoder models dubbed the Conditional Missing data Importance-Weighted Autoencoder (CMIWAE). Our deep latent variable generative model requires little to no feature engineering and does not necessarily rely on the specifics of scoring in the Data Challenge. It is fully trained on incomplete data, with the single objective to maximize log-likelihood of the observed wildfire information. We mitigate the effects of the relatively low number of training samples by stochastic sampling from a variational latent variable distribution, as well as by ensembling a set of CMIWAE models trained and validated on different splits of the provided data. The presented approach is not domain-specific and is amenable to application in other missing data recovery tasks with tabular or image-like information conditioned on auxiliary information.
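As a rough illustration of the objective described above, the following sketch computes an importance-weighted lower bound on the log-likelihood in which only observed entries contribute. It is not the authors' implementation: the function names, the Gaussian observation model, and the NumPy-only setup are assumptions made for this example, and the encoder, decoder, and conditioning on auxiliary information are omitted.

```python
import numpy as np


def masked_gaussian_log_lik(x, mean, std, mask):
    """Sum Gaussian log-densities over observed entries only (mask == 1).

    Hypothetical helper: entries with mask == 0 (missing data) contribute
    nothing, so x may hold NaN or any placeholder at those positions.
    """
    log_pdf = (
        -0.5 * np.log(2.0 * np.pi)
        - np.log(std)
        - 0.5 * ((x - mean) / std) ** 2
    )
    return np.sum(np.where(mask.astype(bool), log_pdf, 0.0), axis=-1)


def importance_weighted_bound(log_w):
    """Compute log((1/K) * sum_k exp(log_w[k])) stably along axis 0.

    log_w[k] is assumed to hold log p(x_obs, z_k) - log q(z_k | x_obs)
    for the k-th latent sample z_k drawn from the variational distribution.
    """
    k = log_w.shape[0]
    m = np.max(log_w, axis=0)
    return m + np.log(np.sum(np.exp(log_w - m), axis=0)) - np.log(k)
```

With a single latent sample (K = 1) the bound reduces to the usual variational lower bound; increasing K tightens it at the cost of more decoder evaluations per data point.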
Data augmentation through multivariate scenario forecasting in Data Centers using Generative Adversarial Networks
Pérez, Jaime, Arroba, Patricia, Moya, José M.
The Cloud paradigm is at a critical point in which the existing energy-efficiency techniques are reaching a plateau, while the demand for computing resources at Data Center facilities continues to increase exponentially. The main challenge in achieving a global energy efficiency strategy based on Artificial Intelligence is that we need massive amounts of data to feed the algorithms. Nowadays, any optimization strategy must begin with data. However, companies with access to these large amounts of data decide not to share them because it could compromise their security. This paper proposes a time-series data augmentation methodology based on synthetic scenario forecasting within the Data Center. For this purpose, we implement a powerful generative algorithm: Generative Adversarial Networks (GANs). The use of GANs allows us to handle multivariate data and data of different natures (e.g., categorical). On the other hand, adapting Data Centers' operational management to the occurrence of sporadic anomalies is complicated due to the reduced frequency of failures in the system. Therefore, we also propose a methodology to increase the generated data variability by introducing on-demand anomalies. We validated our approach using real data collected from an operating Data Center, successfully obtaining forecasts of random scenarios with several hours of prediction. Our research will help to optimize the energy consumed in Data Centers, although the proposed methodology can be employed in any similar time-series-like problem.
Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019
Liu, Zhengying, Pavao, Adrien, Xu, Zhen, Escalera, Sergio, Ferreira, Fabio, Guyon, Isabelle, Hong, Sirui, Hutter, Frank, Ji, Rongrong, Junior, Julio C. S. Jacques, Li, Ge, Lindauer, Marius, Luo, Zhipeng, Madadi, Meysam, Nierhoff, Thomas, Niu, Kangning, Pan, Chunguang, Stoll, Danny, Treguer, Sebastien, Wang, Jin, Wang, Peng, Wu, Chenglin, Xiong, Youcheng, Zela, Arber, Zhang, Yang
This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sort out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high-level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free "AutoDL self-service".
Towards Intrinsic Interactive Reinforcement Learning
Meanwhile, applications of RL have only begun to expand beyond these constrained game environments to more diverse and complex real-world environments such as chip design [86], chemical reaction optimization [133] and performing long-term recommendations [45]. To further progress towards these more complex real-world environments, greater alleviation of challenges currently facing RL (e.g., generalization, robustness, scalability, and safety) is needed [7, 27, 72, 108]. Moreover, we can expect that as the complexity of environments increases, the difficulty in alleviating these challenges will increase as well [27]. For the purpose of this paper, we broadly define known RL challenges as either an aptitude or alignment problem. Aptitude encompasses challenges concerned with being able to learn. Aptitude includes ideas such as robustness, the ability of RL to perform a task (e.g., asymptotic performance) and generalize within/between environments of similar complexity; scalability, the ability of RL to scale up to more complex environments; and aptness, the rate at which an RL algorithm can learn to solve a problem or achieve a desired performance level. Likewise, alignment encompasses challenges concerned with learning as intended [7, 27, 72]. The hypothetical paperclip agent [18] is a classic example of misalignment.
Towards Understanding and Harnessing the Effect of Image Transformation in Adversarial Detection
Liu, Hui, Zhao, Bo, Peng, Yuefeng, Li, Weidong, Liu, Peng
Deep neural networks (DNNs) are threatened by adversarial examples. Adversarial detection, which distinguishes adversarial images from benign images, is fundamental for robust DNN-based services. Image transformation is one of the most effective approaches to detect adversarial examples. During the last few years, a variety of image transformations have been studied and discussed to design reliable adversarial detectors. In this paper, we systematically synthesize the recent progress on adversarial detection via image transformations with a novel classification method. Then, we conduct extensive experiments to test the detection performance of image transformations against state-of-the-art adversarial attacks. Furthermore, we reveal that each individual transformation is not capable of detecting adversarial examples in a robust way, and propose a DNN-based approach referred to as AdvJudge, which combines scores of 9 image transformations. Without knowing which individual scores are misleading or not misleading, AdvJudge can make the right judgment, and achieve a significant improvement in detection accuracy. We claim that AdvJudge is a more effective adversarial detector than those based on an individual image transformation.
On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving
Rossolini, Giulio, Nesti, Federico, D'Amico, Gianluca, Nair, Saasha, Biondi, Alessandro, Buttazzo, Giorgio
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat to the use of deep learning models in safety-critical computer vision tasks such as visual perception in autonomous driving. This paper presents an extensive evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches, including digital, simulated, and physical ones. A novel loss function is proposed to improve the capabilities of attackers in inducing a misclassification of pixels. Also, a novel attack strategy is presented to improve the Expectation Over Transformation method for placing a patch in the scene. Finally, a state-of-the-art method for detecting adversarial patches is first extended to cope with semantic segmentation models, then improved to obtain real-time performance, and eventually evaluated in real-world scenarios. Experimental results reveal that, even though the adversarial effect is visible with both digital and real-world attacks, its impact is often spatially confined to areas of the image around the patch. This opens further questions about the spatial robustness of real-time semantic segmentation models.
Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments
Iyer, Abhiram, Grewal, Karan, Velu, Akash, Souza, Lucas Oliveira, Forest, Jeremy, Ahmad, Subutai
A key challenge for AI is to build embodied systems that operate in dynamically changing environments. Such systems must adapt to changing task contexts and learn continuously. Although standard deep learning systems achieve state-of-the-art results on static benchmarks, they often struggle in dynamic scenarios. In these settings, error signals from multiple contexts can interfere with one another, ultimately leading to a phenomenon known as catastrophic forgetting. In this article we investigate biologically inspired architectures as solutions to these problems. Specifically, we show that the biophysical properties of dendrites and local inhibitory systems enable networks to dynamically restrict and route information in a context-specific manner. Our key contributions are as follows. First, we propose a novel artificial neural network architecture that incorporates active dendrites and sparse representations into the standard deep learning framework. Next, we study the performance of this architecture on two separate benchmarks requiring task-based adaptation: Meta-World, a multi-task reinforcement learning environment where a robotic agent must learn to solve a variety of manipulation tasks simultaneously; and a continual learning benchmark in which the model's prediction task changes throughout training. Analysis on both benchmarks demonstrates the emergence of overlapping but distinct and sparse subnetworks, allowing the system to fluidly learn multiple tasks with minimal forgetting. Our neural implementation marks the first time a single architecture has achieved competitive results in both multi-task and continual learning settings. Our research sheds light on how biological properties of neurons can inform deep learning systems to address dynamic scenarios that are typically impossible for traditional ANNs to solve.
Beta-VAE Reproducibility: Challenges and Extensions
Fil, Miroslav, Mesinovic, Munib, Morris, Matthew, Wildberger, Jonas
Unsupervised learning is known to be brittle even on toy datasets, and a meaningful, mathematically precise definition of disentanglement remains difficult to find. Here we investigate the original β-VAE paper and add evidence to the results previously obtained indicating its lack of reproducibility. We also expand the experimentation on the models and include additional, more complex datasets in the analysis. We also implement an FID scoring metric for the β-VAE model and conduct a qualitative analysis of the results obtained. We end with a brief discussion of possible future investigations that can be conducted to add more robustness to the claims. Variational autoencoders (Kingma & Welling, 2014) are a class of unsupervised representation learning models with a principled probabilistic interpretation that extends normal autoencoders first described by Hinton & Salakhutdinov (2006). However, unsupervised learning is notoriously brittle even on toy datasets, and a meaningful, mathematically precise definition of disentanglement remains difficult to find. It is thus not obvious to what extent β-VAEs can robustly obtain disentangled representations in different settings.
Toward a New Science of Common Sense
Brachman, Ronald J., Levesque, Hector J.
Common sense has always been of interest in AI, but has rarely taken center stage. Despite its mention in one of John McCarthy's earliest papers and years of work by dedicated researchers, arguably no AI system with a serious amount of general common sense has ever emerged. Why is that? What's missing? Examples of AI systems' failures of common sense abound, and they point to AI's frequent focus on expertise as the cause. Those attempting to break the brittleness barrier, even in the context of modern deep learning, have tended to invest their energy in large numbers of small bits of commonsense knowledge. But all the commonsense knowledge fragments in the world don't add up to a system that actually demonstrates common sense in a human-like way. We advocate examining common sense from a broader perspective than in the past. Common sense is more complex than it has been taken to be and is worthy of its own scientific exploration.
Adversarial Attacks against Windows PE Malware Detection: A Survey of the State-of-the-Art
Ling, Xiang, Wu, Lingfei, Zhang, Jiangyu, Qu, Zhenqing, Deng, Wei, Chen, Xiang, Wu, Chunming, Ji, Shouling, Luo, Tianyue, Wu, Jingzheng, Wu, Yanjun
Malware has been one of the most damaging threats to computers, spanning multiple operating systems and various file formats. To defend against the ever-increasing and ever-evolving threats of malware, tremendous efforts have been made to propose a variety of malware detection methods that attempt to detect malware effectively and efficiently. Recent studies have shown that, on the one hand, existing ML and DL methods enable superior detection of newly emerging and previously unseen malware. On the other hand, ML and DL models are inherently vulnerable to adversarial attacks in the form of adversarial examples, which are maliciously generated by slightly and carefully perturbing legitimate inputs to confuse the targeted models. Adversarial attacks were initially studied extensively in the domain of computer vision, and some quickly expanded to other domains, including NLP, speech recognition and even malware detection. In this paper, we focus on malware with the file format of portable executable (PE) in the family of Windows operating systems, namely Windows PE malware, as a representative case to study the adversarial attack methods in such adversarial settings. Specifically, we start by outlining the general learning framework of Windows PE malware detection based on ML/DL and subsequently highlight three unique challenges of performing adversarial attacks in the context of PE malware. We then conduct a comprehensive and systematic review to categorize the state-of-the-art adversarial attacks against PE malware detection, as well as corresponding defenses to increase the robustness of PE malware detection. We conclude the paper by first presenting other related attacks against Windows PE malware detection beyond the adversarial attacks and then shedding light on future research directions and opportunities.