Unsupervised or Indirectly Supervised Learning
Multi-Action Dialog Policy Learning from Logged User Feedback
Zhang, Shuo, Zhao, Junzhou, Wang, Pinghui, Wang, Tianxiang, Liang, Zi, Tao, Jing, Huang, Yi, Feng, Junlan
Multi-action dialog policy, which generates multiple atomic dialog actions per turn, has been widely applied in task-oriented dialog systems to provide expressive and efficient system responses. Existing policy models usually imitate action combinations from the labeled multi-action dialog examples. Due to data limitations, they generalize poorly toward unseen dialog flows. While reinforcement learning-based methods are proposed to incorporate the service ratings from real users and user simulators as external supervision signals, they suffer from sparse and less credible dialog-level rewards. To cope with this problem, we explore to improve multi-action dialog policy learning with explicit and implicit turn-level user feedback received for historical predictions (i.e., logged user feedback) that are cost-efficient to collect and faithful to real-world scenarios. The task is challenging since the logged user feedback provides only partial label feedback limited to the particular historical dialog actions predicted by the agent. To fully exploit such feedback information, we propose BanditMatch, which addresses the task from a feedback-enhanced semi-supervised learning perspective with a hybrid objective of semi-supervised learning and bandit learning. BanditMatch integrates pseudo-labeling methods to better explore the action space through constructing full label feedback. Extensive experiments show that our BanditMatch outperforms the state-of-the-art methods by generating more concise and informative responses. The source code and the appendix of this paper can be obtained from https://github.com/ShuoZhangXJTU/BanditMatch.
Generative Adversarial Networks for Malware Detection: a Survey
Dunmore, Aeryn, Jang-Jaccard, Julian, Sabrina, Fariza, Kwak, Jin
Since their proposal in the 2014 paper by Ian Goodfellow, there has been an explosion of research into the area of Generative Adversarial Networks. While they have been utilised in many fields, the realm of malware research is a problem space in which GANs have taken root. From balancing datasets to creating unseen examples in rare classes, GAN models offer extensive opportunities for application. This paper surveys the current research and literature for the use of Generative Adversarial Networks in the malware problem space. This is done with the hope that the reader may be able to gain an overall understanding as to what the Generative Adversarial model provides for this field, and for what areas within malware research it is best utilised. It covers the current related surveys, the different categories of GAN, and gives the outcomes of recent research into optimising GANs for different topics, as well as future directions for exploration.
FedIL: Federated Incremental Learning from Decentralized Unlabeled Data with Convergence Analysis
Yang, Nan, Yuan, Dong, Liu, Charles Z, Deng, Yongkun, Bao, Wei
Most existing federated learning methods assume that clients have fully labeled data to train on, while in reality, it is hard for the clients to get task-specific labels due to users' privacy concerns, high labeling costs, or lack of expertise. This work considers the server with a small labeled dataset and intends to use unlabeled data in multiple clients for semi-supervised learning. We propose a new framework with a generalized model, Federated Incremental Learning (FedIL), to address the problem of how to utilize labeled data in the server and unlabeled data in clients separately in the scenario of Federated Learning (FL). FedIL uses the Iterative Similarity Fusion to enforce the server-client consistency on the predictions of unlabeled data and uses incremental confidence to establish a credible pseudo-label set in each client. We show that FedIL will accelerate model convergence by Cosine Similarity with normalization, proved by Banach Fixed Point Theorem. The code is available at https://anonymous.4open.science/r/fedil.
Novel Class Discovery: an Introduction and Key Concepts
Troisemaine, Colin, Lemaire, Vincent, Gosselin, Stéphane, Reiffers-Masson, Alexandre, Flocon-Cholet, Joachim, Vaton, Sandrine
Novel Class Discovery (NCD) is a growing field where we are given during training a labeled set of known classes and an unlabeled set of different classes that must be discovered. In recent years, many methods have been proposed to address this problem, and the field has begun to mature. In this paper, we provide a comprehensive survey of the state-of-the-art NCD methods. We start by formally defining the NCD problem and introducing important notions. We then give an overview of the different families of approaches, organized by the way they transfer knowledge from the labeled set to the unlabeled set. We find that they either learn in two stages, by first extracting knowledge from the labeled data only and then applying it to the unlabeled data, or in one stage by conjointly learning on both sets. For each family, we describe their general principle and detail a few representative methods. Then, we briefly introduce some new related tasks inspired by the increasing number of NCD works. We also present some common tools and techniques used in NCD, such as pseudo labeling, self-supervised learning and contrastive learning. Finally, to help readers unfamiliar with the NCD problem differentiate it from other closely related domains, we summarize some of the closest areas of research and discuss their main differences.
Q-Match: Self-Supervised Learning by Matching Distributions Induced by a Queue
Mulc, Thomas, Dwibedi, Debidatta
In semi-supervised learning, student-teacher distribution matching has been successful in improving performance of models using unlabeled data in conjunction with few labeled samples. In this paper, we aim to replicate that success in the self-supervised setup where we do not have access to any labeled data during pre-training. We introduce our algorithm, Q-Match, and show it is possible to induce the student-teacher distributions without any knowledge of downstream classes by using a queue of embeddings of samples from the unlabeled dataset. We focus our study on tabular datasets and show that Q-Match outperforms previous self-supervised learning techniques when measuring downstream classification performance. Furthermore, we show that our method is sample efficient-in terms of both the labels required for downstream training and the amount of unlabeled data required for pre-training-and scales well to the sizes of both the labeled and unlabeled data. Tabular data is the most common form of data for problems in industry. While many robust techniques exist to solve real-world machine learning problems on tabular data, most of these techniques require access to labels. Leveraging unlabeled data to learn good representations remains a key open problem in the tabular domain. In this work, we propose a flexible and powerful framework using deep learning that helps us use unlabeled data in the tabular domain. Deep learning has been successful in processing data in many different domains like images, audio, and text.
Preventing Catastrophic Forgetting in Continual Learning of New Natural Language Tasks
Kar, Sudipta, Castellucci, Giuseppe, Filice, Simone, Malmasi, Shervin, Rokhlenko, Oleg
Multi-Task Learning (MTL) is widely-accepted in Natural Language Processing as a standard technique for learning multiple related tasks in one model. Training an MTL model requires having the training data for all tasks available at the same time. As systems usually evolve over time, (e.g., to support new functionalities), adding a new task to an existing MTL model usually requires retraining the model from scratch on all the tasks and this can be time-consuming and computationally expensive. Moreover, in some scenarios, the data used to train the original training may be no longer available, for example, due to storage or privacy concerns. In this paper, we approach the problem of incrementally expanding MTL models' capability to solve new tasks over time by distilling the knowledge of an already trained model on n tasks into a new one for solving n+1 tasks. To avoid catastrophic forgetting, we propose to exploit unlabeled data from the same distributions of the old tasks. Our experiments on publicly available benchmarks show that such a technique dramatically benefits the distillation by preserving the already acquired knowledge (i.e., preventing up to 20% performance drops on old tasks) while obtaining good performance on the incrementally added tasks. Further, we also show that our approach is beneficial in practical settings by using data from a leading voice assistant.
Unsupervised Learning on a DIET: Datum IndEx as Target Free of Self-Supervision, Reconstruction, Projector Head
Costly, noisy, and over-specialized, labels are to be set aside in favor of unsupervised learning if we hope to learn cheap, reliable, and transferable models. To that end, spectral embedding, self-supervised learning, or generative modeling have offered competitive solutions. Those methods however come with numerous challenges \textit{e.g.} estimating geodesic distances, specifying projector architectures and anti-collapse losses, or specifying decoder architectures and reconstruction losses. In contrast, we introduce a simple explainable alternative -- coined \textbf{DIET} -- to learn representations from unlabeled data, free of those challenges. \textbf{DIET} is blatantly simple: take one's favorite classification setup and use the \textbf{D}atum \textbf{I}nd\textbf{E}x as its \textbf{T}arget class, \textit{i.e. each sample is its own class}, no further changes needed. \textbf{DIET} works without a decoder/projector network, is not based on positive pairs nor reconstruction, introduces no hyper-parameters, and works out-of-the-box across datasets and architectures. Despite \textbf{DIET}'s simplicity, the learned representations are of high-quality and often on-par with the state-of-the-art \textit{e.g.} using a linear classifier on top of DIET's learned representation reaches $71.4\%$ on CIFAR100 with a Resnet101, $52.5\%$ on TinyImagenet with a Resnext50.
New Research on Generative adversarial networks part5(Machine Learning 2023)
Abstract: We present LM-GAN, an HDR sky model that generates photorealistic environment maps with weathered skies. Our sky model retains the flexibility of traditional parametric models and enables the reproduction of photorealistic all-weather skies with visual diversity in cloud formations. This is achieved with flexible and intuitive user controls for parameters, including sun position, sky color, and atmospheric turbidity. Our method is trained directly from inputs fitted to real HDR skies, learning both to preserve the input's illumination and correlate it to the real reference's atmospheric components in an end-to-end manner. Our main contributions are a generative model trained on both sky appearance and scene rendering losses, as well as a novel sky-parameter fitting algorithm.
New Research on Generative adversarial networks part4(Machine Learning 2023)
Abstract: The conventional understanding of adversarial training in generative adversarial networks (GANs) is that the discriminator is trained to estimate a divergence, and the generator learns to minimize this divergence. We argue that despite the fact that many variants of GANs were developed following this paradigm, the current theoretical understanding of GANs and their practical algorithms are inconsistent. In this paper, we leverage Wasserstein gradient flows which characterize the evolution of particles in the sample space, to gain theoretical insights and algorithmic inspiration of GANs. We introduce a unified generative modeling framework -- MonoFlow: the particle evolution is rescaled via a monotonically increasing mapping of the log density ratio. Under our framework, adversarial training can be viewed as a procedure first obtaining MonoFlow's vector field via training the discriminator and the generator learns to draw the particle flow defined by the corresponding vector field.
Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding
Li, Shiyang, Yavuz, Semih, Chen, Wenhu, Yan, Xifeng
Task-adaptive pre-training (TAPT) and Self-training (ST) have emerged as the major semi-supervised approaches to improve natural language understanding (NLU) tasks with massive amount of unlabeled data. However, it's unclear whether they learn similar representations or they can be effectively combined. In this paper, we show that TAPT and ST can be complementary with simple TFS protocol by following TAPT -> Finetuning -> Self-training (TFS) process. Experimental results show that TFS protocol can effectively utilize unlabeled data to achieve strong combined gains consistently across six datasets covering sentiment classification, paraphrase identification, natural language inference, named entity recognition and dialogue slot classification. We investigate various semi-supervised settings and consistently show that gains from TAPT and ST can be strongly additive by following TFS procedure. We hope that TFS could serve as an important semi-supervised baseline for future NLP studies.