Unsupervised or Indirectly Supervised Learning
Understanding CycleGANs using examples & codes
After covering basic GANs (with a sample model) in my last post, we now take a step further and explore an advanced GAN variant, CycleGAN, which has some fascinating applications in the field of image translation. The idea is that if we have an image of a house in the summer, it should be translated into an image depicting winter while keeping the house as it is. How can this be done? But if you ponder for a minute, preparing such a training dataset is a monumental task. For example, for the summer-to-winter translation of the same location, you may need to wait for a year to get pictures. In other cases, the Y just doesn't exist.
Lexico-semantic and affective modelling of Spanish poetry: A semi-supervised learning approach
Barbado, Alberto, González, María Dolores, Carrera, Débora
Text classification tasks have improved substantially over the last few years through the use of transformers. However, most research focuses on prose texts, with poetry receiving less attention, especially for the Spanish language. In this paper, we propose a semi-supervised learning approach for inferring 21 psychological categories evoked by a corpus of 4572 sonnets, along with 10 affective and lexico-semantic multiclass ones. The subset of poems used for training and evaluation includes 270 sonnets. With our approach, we achieve an AUC beyond 0.7 for 76% of the psychological categories, and an AUC over 0.65 for 60% of the multiclass ones. The sonnets are modelled using transformers, through sentence embeddings, along with lexico-semantic and affective features, obtained by using external lexicons. Consequently, we see that this approach provides an AUC increase of up to 0.12, as opposed to using transformers alone.
Online Unsupervised Learning of Visual Representations and Categories
Ren, Mengye, Scott, Tyler R., Iuzzolino, Michael L., Mozer, Michael C., Zemel, Richard
Real world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform distribution. Furthermore, real world interactions demand learning on-the-fly from few or no class labels. In this work, we propose an unsupervised model that simultaneously performs online visual representation learning and few-shot learning of new categories without relying on any class labels. Our model is a prototype-based memory network with a control component that determines when to form a new class prototype. We formulate it as an online Gaussian mixture model, where components are created online with only a single new example, and assignments do not have to be balanced, which permits an approximation to natural imbalanced distributions from uncurated raw data. Learning includes a contrastive loss that encourages different views of the same image to be assigned to the same prototype. The result is a mechanism that forms categorical representations of objects in nonstationary environments. Experiments show that our method can learn from an online stream of visual input data and is significantly better at category recognition compared to state-of-the-art self-supervised learning methods.
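The idea of a control component that decides when to form a new prototype can be illustrated with a deliberately simplified sketch. It is not the authors' online Gaussian mixture model: it assumes a plain squared-distance rule, and the function name `assign_or_create` and the parameter `new_threshold` are hypothetical.

```python
def assign_or_create(prototypes, x, new_threshold):
    """Assign x to its nearest prototype, or create a new prototype from x.

    Assumption: a new prototype is formed from a single example whenever
    the squared distance to every existing prototype exceeds new_threshold.
    """
    if prototypes:
        dists = [sum((a - b) ** 2 for a, b in zip(p, x)) for p in prototypes]
        best = min(range(len(dists)), key=dists.__getitem__)
        if dists[best] <= new_threshold:
            return best
    prototypes.append(list(x))
    return len(prototypes) - 1


# Toy online stream: two nearby points, then two points far away.
stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
prototypes = []
assignments = [assign_or_create(prototypes, x, new_threshold=1.0) for x in stream]
```

Nothing in this rule forces the clusters to be balanced, which is the property the abstract highlights for imbalanced raw data.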
Few-shot Learning via Dependency Maximization and Instance Discriminant Analysis
We study the few-shot learning (FSL) problem, where a model learns to recognize new objects with extremely few labeled training data per category. Most previous FSL approaches resort to the meta-learning paradigm, where the model accumulates inductive bias through learning many training tasks so as to solve a new unseen few-shot task. In contrast, we propose a simple approach to exploit unlabeled data accompanying the few-shot task for improving few-shot performance. Firstly, we propose a Dependency Maximization method based on the Hilbert-Schmidt norm of the cross-covariance operator, which maximizes the statistical dependency between the embedded features of those unlabeled data and their label predictions, together with the supervised loss over the support set. We then use the obtained model to infer the pseudo-labels for those unlabeled data. Furthermore, we propose an Instance Discriminant Analysis to evaluate the credibility of each pseudo-labeled example and select the most faithful ones into an augmented support set to retrain the model as in the first step. We iterate the above process until the pseudo-labels for the unlabeled data become stable. Following the standard transductive and semi-supervised FSL setting, our experiments show that the proposed method outperforms previous state-of-the-art methods on four widely used benchmarks, including mini-ImageNet, tiered-ImageNet, CUB, and CIFARFS.
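The dependency measure named in the abstract, the squared Hilbert-Schmidt norm of the cross-covariance operator, is commonly estimated from Gram matrices. The sketch below shows the standard biased empirical estimator on scalar inputs. It is only an illustration: the linear kernel, the scalar features, and the function names are assumptions, not the paper's setup.

```python
def centered_gram(values, kernel):
    """Build the Gram matrix of a symmetric kernel and double-center it."""
    n = len(values)
    gram = [[kernel(a, b) for b in values] for a in values]
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    # For a symmetric kernel, column means equal row means.
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, kernel=lambda a, b: a * b):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(xs)
    kc = centered_gram(xs, kernel)
    lc = centered_gram(ys, kernel)
    total = sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


dependent = hsic([1, 2, 3, 4], [1, 2, 3, 4])  # y is a copy of x
constant = hsic([1, 2, 3, 4], [5, 5, 5, 5])   # y carries no information about x
```

A larger value indicates stronger statistical dependence between the two sets of values; a constant second argument yields zero.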
Review -- Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction
By performing this pretext task of predicting X2 from X1, we hope to achieve a representation F(X1) which contains high-level abstractions or semantics. By concatenating the representations layer-wise, F^l = {F^l_1, F^l_2}, a representation F is achieved which is pretrained on the full input tensor X. If F is a CNN of a desired fixed size, e.g., AlexNet, we can design the subnetworks F1, F2 by splitting each layer of the network F in half, along the channel dimension. However, it is found that the proposed Split-Brain Autoencoder (Section 1.2) outperforms the above two alternatives (Section 1.3).
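The channel split and layer-wise concatenation described above can be sketched on a toy input. This is only an illustration of the data flow: the per-channel sums stand in for the real sub-networks F1 and F2, and the helper names are hypothetical.

```python
def split_channels(x):
    """Split a list of channels into two disjoint halves X1 and X2."""
    half = len(x) // 2
    return x[:half], x[half:]


def concat_layerwise(f1, f2):
    """Concatenate the features of the two sub-networks for one layer."""
    return f1 + f2


# Toy "image" with 4 channels, each a flat list of pixel values.
x = [[1, 2], [3, 4], [5, 6], [7, 8]]
x1, x2 = split_channels(x)

# Hypothetical stand-ins for the sub-networks: F1 sees only X1, F2 sees only X2.
f1 = [sum(channel) for channel in x1]
f2 = [sum(channel) for channel in x2]
f = concat_layerwise(f1, f2)
```

Each half only ever sees its own subset of channels, yet the concatenated feature `f` covers the full input.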
A Two-stage Complex Network using Cycle-consistent Generative Adversarial Networks for Speech Enhancement
Yu, Guochen, Wang, Yutian, Wang, Hui, Zhang, Qin, Zheng, Chengshi
Cycle-consistent generative adversarial networks (CycleGAN) have shown their promising performance for speech enhancement (SE), while one intractable shortcoming of these CycleGAN-based SE systems is that the noise components propagate throughout the cycle and cannot be completely eliminated. Additionally, conventional CycleGAN-based SE systems only estimate the spectral magnitude, while the phase is unaltered. Motivated by the multi-stage learning concept, we propose a novel two-stage denoising system that combines a CycleGAN-based magnitude enhancing network and a subsequent complex spectral refining network in this paper. Specifically, in the first stage, a CycleGAN-based model is responsible for only estimating magnitude, which is subsequently coupled with the original noisy phase to obtain a coarsely enhanced complex spectrum. After that, the second stage is applied to further suppress the residual noise components and estimate the clean phase by a complex spectral mapping network, which is a pure complex-valued network composed of complex 2D convolution/deconvolution and complex temporal-frequency attention blocks. Experimental results on two public datasets demonstrate that the proposed approach consistently surpasses previous one-stage CycleGANs and other state-of-the-art SE systems in terms of various evaluation metrics, especially in background noise suppression.
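The first-stage output described in the abstract, an enhanced magnitude coupled with the original noisy phase, amounts to a polar-to-rectangular conversion per time-frequency bin. A minimal sketch for a single bin, assuming the enhanced magnitude is already available (the function name `recombine` is hypothetical):

```python
import cmath


def recombine(enhanced_magnitude, noisy_bin):
    """Pair an enhanced magnitude with the phase of the noisy complex bin."""
    return cmath.rect(enhanced_magnitude, cmath.phase(noisy_bin))


noisy_bin = complex(0.0, 2.0)        # noisy spectrum value, phase = pi / 2
coarse = recombine(1.0, noisy_bin)   # magnitude reduced, phase unchanged
```

The result keeps the noisy phase untouched, which is why the abstract adds a second stage to estimate the clean phase.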
Supervised vs Unsupervised Learning, Explained
In this article, I'll explain supervised vs unsupervised learning. The tutorial starts with some foundational concepts and then explains supervised and unsupervised learning separately, in more detail. If you're confused about supervised vs unsupervised learning, you'll probably want to read the whole article from start to finish. If you're somewhat new to machine learning, you've probably heard the terms "supervised" and "unsupervised" learning.
Dash: Semi-Supervised Learning with Dynamic Thresholding
Xu, Yi, Shang, Lei, Ye, Jinxing, Qian, Qi, Li, Yu-Feng, Sun, Baigui, Li, Hao, Jin, Rong
While semi-supervised learning (SSL) has received tremendous attention in many machine learning tasks due to its successful use of unlabeled data, existing SSL algorithms use either all unlabeled examples or the unlabeled examples with a fixed high-confidence prediction during the training process. However, it is possible that too many correct/wrong pseudo labeled examples are eliminated/selected. In this work we develop a simple yet powerful framework, whose key idea is to select a subset of training examples from the unlabeled data when performing existing SSL methods so that only the unlabeled examples with pseudo labels related to the labeled data will be used to train models. The selection is performed at each updating iteration by only keeping the examples whose losses are smaller than a given threshold that is dynamically adjusted through the iteration. Our proposed approach, Dash, enjoys its adaptivity in terms of unlabeled data selection and its theoretical guarantee. Specifically, we theoretically establish the convergence rate of Dash from the view of non-convex optimization. Finally, we empirically demonstrate the effectiveness of the proposed method in comparison with state-of-the-art methods on benchmarks.
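The selection rule in the abstract, keeping only unlabeled examples whose loss falls below a threshold that changes over the iterations, can be sketched as follows. The geometric decay schedule and all names here are assumptions for illustration; the abstract does not specify how the threshold is adjusted.

```python
def dynamic_threshold(initial, decay, step):
    """Hypothetical schedule: the threshold shrinks geometrically per step."""
    return initial * decay ** step


def select_unlabeled(losses, threshold):
    """Return indices of unlabeled examples whose loss is below the threshold."""
    return [i for i, loss in enumerate(losses) if loss < threshold]


# Toy per-example losses on pseudo-labeled data, held fixed for clarity.
losses = [0.2, 1.5, 0.9, 0.05]
kept_by_step = [
    select_unlabeled(losses, dynamic_threshold(2.0, 0.5, step))
    for step in range(3)
]
```

With these toy values, the selected subset shrinks as the threshold tightens, so later updates rely only on the examples the model fits well.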
A Survey on Deep Semi-supervised Learning
Yang, Xiangli, Song, Zixing, King, Irwin, Xu, Zenglin
Deep semi-supervised learning is a fast-growing field with a range of practical applications. This paper provides a comprehensive survey on both fundamentals and recent advances in deep semi-supervised learning methods from perspectives of model design and unsupervised loss functions. We first present a taxonomy for deep semi-supervised learning that categorizes existing methods, including deep generative methods, consistency regularization methods, graph-based methods, pseudo-labeling methods, and hybrid methods. Then we provide a comprehensive review of 52 representative methods and offer a detailed comparison of these methods in terms of the type of losses, contributions, and architecture differences. In addition to the progress in the past few years, we further discuss some shortcomings of existing methods and provide some tentative heuristic solutions for solving these open problems.
On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity
Le, Khang, Nguyen, Huy, Pham, Tung, Ho, Nhat
The recent advances in the computation of optimal transport (OT) [34, 11, 24, 12, 1] have brought new applications of optimal transport in machine learning and data science to the fore. Examples of these applications include generative models [2, 33, 16, 10], unsupervised learning [17, 18], computer vision [32, 25], and other applications [29, 27, 6]. However, due to the marginal constraints of transportation plans, optimal transport is only defined between balanced measures, namely, measures with equal masses. When measures are unbalanced, i.e., they can have different masses, there are two popular approaches for defining divergences between these measures. The first approach is unbalanced optimal transport [9, 8]. The main idea of unbalanced optimal transport is to regularize the objective function of optimal transport based on certain divergences between the marginal constraints of the transportation plan and the masses of the measures. Despite its favorable computational complexity [28] and practical applications [31, 14, 19, 3], the optimal transportation plan from unbalanced optimal transport is often non-trivial to interpret in practice. The second approach for defining divergence between unbalanced measures is partial optimal transport (POT) [5, 13], which was originally used to analyze partial differential equations.