Overview
Game of GANs: Game Theoretical Models for Generative Adversarial Networks
Moghadam, Monireh Mohebbi, Boroomand, Bahar, Jalali, Mohammad, Zareian, Arman, DaeiJavad, Alireza, Manshaei, Mohammad Hossein
Generative Adversarial Networks (GANs), a promising research direction in the AI community, have recently attracted considerable attention for their ability to generate high-quality realistic data. A GAN is a competitive game between two neural networks trained in an adversarial manner to reach a Nash equilibrium. Despite the progress made on GANs in recent years, several issues remain unsolved, and how to tackle them has attracted growing research interest. This paper reviews literature that leverages game theory in GANs and addresses how game models can relieve specific challenges of generative models and improve GAN performance. In particular, we first review some preliminaries, including the basic GAN model and some game theory background. After that, we present our taxonomy, which summarizes the state-of-the-art solutions into three significant categories: modified game model, modified architecture, and modified learning method. The classification is based on the modifications that the proposed approaches make to the basic model from the game-theoretic perspective. We further classify each category into several subcategories. Following the proposed taxonomy, we explore the main objective of each class and review the recent work in each group. Finally, we discuss the remaining challenges in this field and present potential future research topics.
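For orientation, the two-player game behind the basic GAN model is usually written as the minimax objective below. This is the standard formulation from the original GAN literature, not something quoted from this abstract; the symbols $G$ (generator), $D$ (discriminator), $p_{\text{data}}$ (data distribution), and $p_z$ (noise prior) are notation assumed here.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```

The discriminator maximizes $V$ while the generator minimizes it, which is why training is described as seeking a Nash equilibrium of a zero-sum game.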
Zero-Shot Controlled Generation with Encoder-Decoder Transformers
Hazarika, Devamanyu, Namazifar, Mahdi, Hakkani-Tür, Dilek
Controlling neural network-based models for natural language generation (NLG) has broad applications in numerous areas such as machine translation, document summarization, and dialog systems. Approaches that enable such control in a zero-shot manner would be of great importance as, among other reasons, they remove the need for additional annotated data and training. In this work, we propose novel approaches for controlling encoder-decoder transformer-based NLG models in zero-shot. This is done by introducing three control knobs, namely, attention biasing, decoder mixing, and context augmentation, that are applied to these models at generation time. These knobs control the generation process by directly manipulating trained NLG models (e.g., biasing cross-attention layers) to realize the desired attributes in the generated outputs. We show that not only are these NLG models robust to such manipulations, but also their behavior could be controlled without an impact on their generation performance. These results, to the best of our knowledge, are the first of their kind. Through these control knobs, we also investigate the role of transformer decoder's self-attention module and show strong evidence that its primary role is maintaining fluency of sentences generated by these models. Based on this hypothesis, we show that alternative architectures for transformer decoders could be viable options. We also study how this hypothesis could lead to more efficient ways for training encoder-decoder transformer models.
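A minimal sketch of what biasing a cross-attention distribution at generation time could look like. The abstract does not give the exact procedure, so the scheme here (scale the softmax weights by per-position factors, then renormalize) is an assumption, and the names `softmax` and `biased_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(scores, bias):
    """Scale attention weights by non-negative per-position bias factors
    and renormalize so they sum to 1 (assumed scheme, not the paper's)."""
    if len(scores) != len(bias):
        raise ValueError("scores and bias must have the same length")
    if any(b < 0 for b in bias):
        raise ValueError("bias factors must be non-negative")
    weights = softmax(scores)
    scaled = [w * b for w, b in zip(weights, bias)]
    total = sum(scaled)
    if total == 0:
        raise ValueError("bias removes all attention mass")
    return [s / total for s in scaled]


# Hypothetical example: four source tokens with equal scores;
# the third token is up-weighted by a factor of 5.
attention = biased_attention([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 5.0, 1.0])
print(attention)
```

No parameters of the trained model change in this sketch; only the attention weights used at inference are altered, which matches the zero-shot setting the abstract describes.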
Avnet Empowers the Artificial Intelligence Ecosystem with its Partners - ELE Times
Leading global technology distributor and solutions provider Avnet Asia will host the "Avnet AI Cloud Exhibition", showcasing innovative technology, applications and solutions in Artificial Intelligence (AI) and machine learning together with its suppliers and partners. With the ability to quickly design, develop and deploy solutions, Avnet can meet the needs of a variety of application scenarios to accelerate the industrialization of artificial intelligence. During this period, Avnet will also hold the "Avnet 2021 Artificial Intelligence Cloud Conference" on June 29, 2021. Joined by developers, engineers, and decision makers in the AI field, the summit will feature cutting-edge technology trends in artificial intelligence and machine learning, and in-depth discussions on the development, future prospects and blueprints for AI to encourage and accelerate innovation. KS Lim, senior director of supplier management at Avnet Asia said, "MarketsandMarkets forecasts the global artificial intelligence (AI) market size to grow to over USD 300 billion by 2026, and the market in Asia Pacific is anticipated to grow at the highest CAGR during the forecast period. As the world's leading technology distributor and solution provider, Avnet has a comprehensive ecosystem that provides customers with end-to-end artificial intelligence and machine learning solutions, reducing the cost and complexity of product development to enable application scenarios. We will continue to work hand in hand with our suppliers and partners to further contribute to the development and maturity of the entire AI ecosystem."
Boosting in the Presence of Massart Noise
Diakonikolas, Ilias, Impagliazzo, Russell, Kane, Daniel, Lei, Rex, Sorrell, Jessica, Tzamos, Christos
We study the problem of boosting the accuracy of a weak learner in the (distribution-independent) PAC model with Massart noise. In the Massart noise model, the label of each example $x$ is independently misclassified with probability $\eta(x) \leq \eta$, where $\eta<1/2$. The Massart model lies between the random classification noise model and the agnostic model. Our main positive result is the first computationally efficient boosting algorithm in the presence of Massart noise that achieves misclassification error arbitrarily close to $\eta$. Prior to our work, no non-trivial booster was known in this setting. Moreover, we show that this error upper bound is best possible for polynomial-time black-box boosters, under standard cryptographic assumptions. Our upper and lower bounds characterize the complexity of boosting in the distribution-independent PAC model with Massart noise. As a simple application of our positive result, we give the first efficient Massart learner for unions of high-dimensional rectangles.
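The Massart noise model defined above can be illustrated with a small simulation. This is only a sketch: the threshold target, the noise function `eta_of_x`, and the helper name `massart_labels` are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def massart_labels(xs, target, eta_of_x, eta_max, rng):
    """Label each x with target(x), flipping the label independently
    with probability eta_of_x(x), where eta_of_x(x) <= eta_max < 1/2."""
    if not 0 <= eta_max < 0.5:
        raise ValueError("eta_max must lie in [0, 1/2)")
    noisy = []
    for x in xs:
        eta_x = eta_of_x(x)
        if not 0 <= eta_x <= eta_max:
            raise ValueError("eta(x) must lie in [0, eta_max]")
        y = target(x)
        if rng.random() < eta_x:
            y = -y  # misclassify this example
        noisy.append(y)
    return noisy


# Hypothetical setup: a threshold target on [-1, 1] whose noise rate
# grows with |x| but never exceeds eta_max = 0.3.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(10000)]
target = lambda x: 1 if x >= 0 else -1
eta_of_x = lambda x: 0.3 * abs(x)
labels = massart_labels(xs, target, eta_of_x, 0.3, rng)

# Empirical fraction of flipped labels; it stays below eta_max.
flip_rate = sum(l != target(x) for x, l in zip(xs, labels)) / len(xs)
print(flip_rate)
```

Setting `eta_of_x` to a constant recovers random classification noise, which is one end of the range the abstract says the Massart model lies between.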
Certification of embedded systems based on Machine Learning: A survey
Vidot, Guillaume, Gabreau, Christophe, Ober, Ileana, Ober, Iulian
Nevertheless, the recent advances in machine learning have triggered genuine interest, as machine learning offers promising preliminary results and opens the way to a wide range of new functions for avionics systems, for instance in the area of autonomous flying. In this paper we investigate how existing certification and regulation techniques can (or cannot) handle software development that includes parts obtained by machine learning. Nowadays a large aircraft cockpit offers many complex avionic functions: flight controls, navigation, surveillance, communications, displays... Their design has required a top-down iterative approach from the aircraft level downward; thus the functions are performed by systems of systems, with each system decomposed into subsystems that may contain a collection of software and hardware items. Therefore, any avionic development considers three levels of engineering: (i) Function, (ii) System/Subsystem and (iii) Item. The development process at each engineering level relies on several decades of experience and good practices that continue to be adapted today.
Outlier detection in multivariate functional data through a contaminated mixture model
Amovin-Assagba, Martial, Gannaz, Irène, Jacques, Julien
This work is motivated by an application in an industrial context, where the activity of sensors is recorded at a high frequency. The objective is to automatically detect abnormal measurement behaviour. Considering the sensor measures as functional data, we are formally interested in detecting outliers in a multivariate functional data set. Due to the heterogeneity of this data set, the proposed contaminated mixture model both clusters the multivariate functional data into homogeneous groups and detects outliers. The main advantage of this procedure over its competitors is that it does not require us to specify the proportion of outliers. Model inference is performed through an Expectation-Conditional Maximization algorithm, and the BIC criterion is used to select the number of clusters. Numerical experiments on simulated data demonstrate the high performance achieved by the inference algorithm. In particular, the proposed model outperforms competitors. Its application on the real data which motivated this study allows us to correctly detect abnormal behaviours.
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
Chen, Jiaao, Tam, Derek, Raffel, Colin, Bansal, Mohit, Yang, Diyi
NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, or expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand which methods work in which settings. In this paper, we provide an empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting, summarizing the landscape of methods (including token-level augmentations, sentence-level augmentations, adversarial augmentations, and hidden-space augmentations) and carrying out experiments on 11 datasets covering topics/news classification, inference tasks, paraphrasing tasks, and single-sentence tasks. Based on the results, we draw several conclusions to help practitioners choose appropriate augmentations in different settings and discuss the current challenges and future directions for limited data learning in NLP.
Invariant Information Bottleneck for Domain Generalization
Li, Bo, Shen, Yifei, Wang, Yezhen, Zhu, Wenzhen, Reed, Colorado J., Zhang, Jun, Li, Dongsheng, Keutzer, Kurt, Zhao, Han
The main challenge for domain generalization (DG) is to overcome the potential distributional shift between multiple training domains and unseen test domains. One popular class of DG algorithms aims to learn representations that have an invariant causal relation across the training domains. However, certain features, called pseudo-invariant features, may be invariant in the training domains but not in the test domain and can substantially decrease the performance of existing algorithms. To address this issue, we propose a novel algorithm, called Invariant Information Bottleneck (IIB), that learns a minimally sufficient representation that is invariant across training and testing domains. By minimizing the mutual information between the representation and inputs, IIB alleviates its reliance on pseudo-invariant features, which is desirable for DG. To verify the effectiveness of the IIB principle, we conduct extensive experiments on large-scale DG benchmarks. The results show that IIB outperforms invariant learning baselines (e.g., IRM) by an average of 2.8% and 3.8% accuracy over two evaluation metrics.
The Essential Guide to Transformers, the Key to Modern SOTA AI - KDnuggets
Are you overwhelmed by the vast array of X-formers? X-formers is the name given to the wide array of Transformer variants that have been implemented or proposed. You likely know Transformers from their recent spate of success stories in natural language processing, computer vision, and other areas of artificial intelligence, but are you familiar with all of the X-formers? More importantly, do you know the differences, and why you might use one over another? A Survey of Transformers, by Tianyang Lin, Yuxin Wang, Xiangyang Liu, and Xipeng Qiu, has been written to help interested readers in this regard.
Active Learning for Network Traffic Classification: A Technical Survey
Shahraki, Amin, Abbasi, Mahmoud, Taherkordi, Amir, Jurcut, Anca Delia
Network Traffic Classification (NTC) has become an important component in a wide variety of network management operations, e.g., Quality of Service (QoS) provisioning and security purposes. Machine Learning (ML) algorithms, a common approach to NTC, can achieve reasonable accuracy and handle encrypted traffic. However, ML-based NTC techniques suffer from a shortage of labeled traffic data, which is the case in many real-world applications. This study investigates the applicability of an active form of ML, called Active Learning (AL), which reduces the need for a large number of labeled examples by actively choosing the instances that should be labeled. The study first provides an overview of NTC and its fundamental challenges and surveys the literature on using ML techniques in NTC. Then, it introduces the concepts of AL, discusses it in the context of NTC, and reviews the literature in this field. Further, challenges and open issues in the use of AL for NTC are discussed. Additionally, as a technical survey, some experiments are conducted to show the broad applicability of AL in NTC. The simulation results show that AL can achieve high accuracy with a small amount of data.
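The idea of "actively choosing the instances that should be labeled" can be sketched with uncertainty sampling, one common AL query strategy. The abstract does not say which strategies the survey evaluates, so this choice is an assumption, and the function name and probabilities below are hypothetical.

```python
def select_most_uncertain(probabilities, k):
    """Return the indices of the k unlabeled instances whose predicted
    positive-class probability is closest to 0.5 (most uncertain first)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical classifier outputs for five unlabeled traffic flows.
pool_probs = [0.95, 0.52, 0.10, 0.47, 0.80]
query = select_most_uncertain(pool_probs, 2)
print(query)  # [1, 3]
```

In a full AL loop, the selected flows would be sent to an annotator, added to the labeled set, and the classifier retrained before the next query round.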