Unsupervised or Indirectly Supervised Learning
Reviews: Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
This work proposes to use semi-supervised learning, in the form of an unsupervised loss term, for improving the regularization capacity of CNNs. The idea (and the proposed loss) is conceptually simple and enforces stability explicitly by minimizing the difference between predictions corresponding to the same input data point. The paper focuses mainly on the experimental side, devoting most of its length to results obtained by adding the new loss to standard supervised CNNs. This is the stronger aspect of the work; the weaker aspects are the missing or poorly defined baselines and the lack of any theoretical justification, derivation or discussion. Novelty/originality: The main contribution is the application of the unsupervised loss term for controlling the stability of the predictions under transformations or stochastic variability.
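The stability idea described in this review can be illustrated with a minimal sketch. It assumes the loss is a sum of squared differences over all pairs of prediction vectors obtained from repeated stochastic passes over the same input; the review only says "the difference between predictions", so the exact form and the name `stability_loss` are assumptions made for illustration.

```python
def stability_loss(predictions):
    """Sum of squared differences over all pairs of prediction vectors.

    `predictions` holds the outputs of several stochastic passes
    (random augmentation, dropout, ...) over the SAME input point.
    The squared-difference form is an assumption for this sketch.
    """
    total = 0.0
    n = len(predictions)
    for i in range(n):
        for j in range(i + 1, n):
            total += sum(
                (a - b) ** 2 for a, b in zip(predictions[i], predictions[j])
            )
    return total


# Three hypothetical softmax outputs for one unlabeled image.
preds = [[0.7, 0.3], [0.6, 0.4], [0.8, 0.2]]
loss = stability_loss(preds)
```

Because the term needs no labels, it can be evaluated on unlabeled inputs and added to the usual supervised loss with a weighting coefficient.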
Reviews: Unsupervised Learning of 3D Structure from Images
The paper is well written, clear and contains appropriate references. The figures also help illustrate and clarify the text. The technical content of the paper seems sound, but its ideas are not particularly novel. The proposed model is an extension of DRAW, which now includes a context (which is observed and used as an additional input to the recognition and generation networks) and generates 3D representations instead of 2D representations. This requires the addition of a projection step to 2D at the end of the generation process when the observations are 2D.
Reviews: Graphical Generative Adversarial Networks
This paper proposes Graphical-GAN, a variant of GAN that combines the expressivity of Graphical Models (in particular, Bayesian nets) with the generative inductive bias of Generative Adversarial Networks. For highly structured latent variables, such as the ones considered in this work, the discriminator's task of distinguishing X,Z samples from the two distributions can be difficult. As a second major contribution, the work proposes a learning procedure inspired by Expectation Propagation (EP). Here, the factorization structure of the graphical model is explicitly exploited to make the task of the discriminator "easier" by comparing only subsets of variables. Finally, the authors perform experiments for controlled generation using a GAN model with a mixture of Gaussians prior, and a State-Space structure to empirically validate their approach.
Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
Semi-supervised learning (SSL) provides a powerful framework for leveraging unlabeled data when labels are limited or expensive to obtain. SSL algorithms based on deep neural networks have recently proven successful on standard benchmark tasks. However, we argue that these benchmarks fail to address many issues that SSL algorithms would face in real-world applications. After creating a unified reimplementation of various widely-used SSL techniques, we test them in a suite of experiments designed to address these issues. We find that the performance of simple baselines which do not use unlabeled data is often underreported, SSL methods differ in sensitivity to the amount of labeled and unlabeled data, and performance can degrade substantially when the unlabeled dataset contains out-of-distribution examples.
Diverse Shape Completion via Style Modulated Generative Adversarial Networks
Shape completion aims to recover the full 3D geometry of an object from a partial observation. This problem is inherently multi-modal since there can be many ways to plausibly complete the missing regions of a shape. Such diversity would be indicative of the underlying uncertainty of the shape and could be preferable for downstream tasks such as planning. In this paper, we propose a novel conditional generative adversarial network that can produce many diverse plausible completions of a partially observed point cloud. To enable our network to produce multiple completions for the same partial input, we introduce stochasticity into our network via style modulation.
Granger Components Analysis: Unsupervised learning of latent temporal dependencies
A new technique for unsupervised learning of time series data based on the notion of Granger causality is presented. The technique learns pairs of projections of a multivariate data set such that the resulting components -- "driving" and "driven" -- maximize the strength of the Granger causality between the latent time series (how strongly the past of the driving signal predicts the present of the driven signal). A coordinate descent algorithm that learns pairs of coefficient vectors in an alternating fashion is developed and shown to blindly identify the underlying sources (up to scale) on simulated vector autoregressive (VAR) data. The technique is tested on scalp electroencephalography (EEG) data from a motor imagery experiment where the resulting components lateralize with the side of the cued hand, and also on functional magnetic resonance imaging (fMRI) data, where the recovered components express previously reported resting-state networks.
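The quantity being maximized, "how strongly the past of the driving signal predicts the present of the driven signal", can be sketched for a single pair of time series. The version below is one common formulation and an assumption on our part, not necessarily the paper's objective: it fits lag-1 least-squares models without intercept and returns the log ratio of residual sums of squares with and without the driving signal's past. The function name `granger_strength` is hypothetical.

```python
import math
import random


def granger_strength(driving, driven):
    """log(RSS_restricted / RSS_full) for lag-1 models without intercept."""
    y_now, y_past, x_past = driven[1:], driven[:-1], driving[:-1]

    # Restricted model: y_t = a * y_{t-1}
    syy = sum(p * p for p in y_past)
    ty = sum(p * n for p, n in zip(y_past, y_now))
    a = ty / syy
    rss_r = sum((n - a * p) ** 2 for p, n in zip(y_past, y_now))

    # Full model: y_t = a * y_{t-1} + b * x_{t-1}, via 2x2 normal equations
    sxx = sum(q * q for q in x_past)
    sxy = sum(p * q for p, q in zip(y_past, x_past))
    tx = sum(q * n for q, n in zip(x_past, y_now))
    det = syy * sxx - sxy * sxy
    a_f = (ty * sxx - tx * sxy) / det
    b_f = (tx * syy - ty * sxy) / det
    rss_f = sum(
        (n - a_f * p - b_f * q) ** 2
        for p, q, n in zip(y_past, x_past, y_now)
    )
    return math.log(rss_r / rss_f)


# Simulated data in which x drives y but not the other way round.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.0]
for t in range(1, 500):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0.0, 0.1))

forward = granger_strength(x, y)  # x -> y, expected to be large
reverse = granger_strength(y, x)  # y -> x, expected to be near zero
```

In the setting of the abstract, `driving` and `driven` would each be a learned projection of the multivariate data, and the coordinate descent would alternate between updating the two coefficient vectors to increase a score of this kind.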
Optimal Block-wise Asymmetric Graph Construction for Graph-based Semi-supervised Learning
Graph-based semi-supervised learning (GSSL) serves as a powerful tool to model the underlying manifold structures of samples in high-dimensional spaces. It involves two phases: constructing an affinity graph from available data and inferring labels for unlabeled nodes on this graph. While numerous algorithms have been developed for label inference, the crucial graph construction phase has received comparatively less attention, despite its significant influence on the subsequent phase. In this paper, we present an optimal asymmetric graph structure for the label inference phase with theoretical motivations. Unlike existing graph construction methods, we differentiate the distinct roles that labeled nodes and unlabeled nodes could play.
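The two phases named in the abstract can be sketched generically. The example below is not the paper's asymmetric construction: it assumes a plain symmetric Gaussian-kernel affinity graph followed by iterative label propagation with labeled nodes clamped, and the function names are hypothetical.

```python
import math


def affinity_matrix(points, sigma=1.0):
    """Phase 1: dense Gaussian-kernel affinity graph (no self-loops)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return W


def propagate_labels(W, labels, n_classes, n_iters=50):
    """Phase 2: average neighbor scores, keeping labeled nodes fixed.

    `labels` maps node index -> class for the labeled nodes only.
    """
    n = len(W)
    F = [[0.0] * n_classes for _ in range(n)]
    for i, c in labels.items():
        F[i][c] = 1.0
    for _ in range(n_iters):
        new_F = []
        for i in range(n):
            if i in labels:
                new_F.append(F[i][:])  # clamp known labels
                continue
            row_sum = sum(W[i])
            new_F.append([
                sum(W[i][j] * F[j][c] for j in range(n)) / row_sum
                for c in range(n_classes)
            ])
        F = new_F
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Two well-separated 1D clusters with one labeled node each.
points = [(0.0,), (0.2,), (0.4,), (3.0,), (3.2,), (3.4,)]
W = affinity_matrix(points, sigma=0.5)
predicted = propagate_labels(W, labels={0: 0, 5: 1}, n_classes=2)
```

Here every node plays the same role in the graph; the abstract's point is that labeled and unlabeled nodes could instead be treated differently when the graph is built.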
Neural Modulation for Flash Memory: An Unsupervised Learning Framework for Improved Reliability
Recent years have witnessed a significant increase in the storage density of NAND flash memory, making it a critical component in modern electronic devices. However, with the rise in storage capacity comes an increased likelihood of errors in data storage and retrieval. The growing number of errors poses ongoing challenges for system designers and engineers, in terms of the characterization, modeling, and optimization of NAND-based systems. We present a novel approach for modeling and preventing errors by utilizing the capabilities of generative and unsupervised machine learning methods. As part of our research, we constructed and trained a neural modulator that translates information bits into programming operations on each memory cell in NAND devices.
Beyond Myopia: Learning from Positive and Unlabeled Data through Holistic Predictive Trends
Learning binary classifiers from positive and unlabeled data (PUL) is vital in many real-world applications, especially when verifying negative examples is difficult. Despite the impressive empirical performance of recent PUL methods, challenges like accumulated errors and increased estimation bias persist due to the absence of negative labels. In this paper, we unveil an intriguing yet long-overlooked observation in PUL: resampling the positive data in each training iteration to ensure a balanced distribution between positive and unlabeled examples results in strong early-stage performance. Furthermore, predictive trends for positive and negative classes display distinctly different patterns. Specifically, the scores (output probability) of unlabeled negative examples consistently decrease, while those of unlabeled positive examples show largely chaotic trends. Instead of focusing on classification within individual time frames, we innovatively adopt a holistic approach, interpreting the scores of each example as a temporal point process (TPP).
Enhancing CLIP with CLIP: Exploring Pseudolabeling for Limited-Label Prompt Tuning
Fine-tuning vision-language models (VLMs) like CLIP to downstream tasks is often necessary to optimize their performance. However, a major obstacle is the limited availability of labeled data. We study the use of pseudolabels, i.e., heuristic labels for unlabeled data, to enhance CLIP via prompt tuning. Conventional pseudolabeling trains a model on labeled data and then generates labels for unlabeled data. VLMs' zero-shot capabilities enable a "second generation" of pseudolabeling approaches that do not require task-specific training on labeled data.
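A minimal sketch of pseudolabel assignment from class scores follows. It assumes the scores for each unlabeled example are already available (for instance from a zero-shot model) and that only the k most confident examples per class are kept; the selection rule and the name `top_k_pseudolabels` are assumptions for illustration, not details taken from the abstract.

```python
def top_k_pseudolabels(scores, k):
    """Assign each example its argmax class, then keep the k most
    confident examples per class.

    `scores` is a list of per-example lists of class scores.
    Returns a dict mapping class index -> list of example indices.
    """
    n_classes = len(scores[0])
    by_class = {c: [] for c in range(n_classes)}
    for i, row in enumerate(scores):
        c = max(range(n_classes), key=lambda j: row[j])
        by_class[c].append((row[c], i))
    return {
        c: [i for _, i in sorted(items, reverse=True)[:k]]
        for c, items in by_class.items()
    }


# Hypothetical scores for four unlabeled images and two classes.
scores = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
selected = top_k_pseudolabels(scores, k=2)
```

The selected indices, paired with their assigned classes, would then serve as extra training data for prompt tuning.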