 Unsupervised or Indirectly Supervised Learning


Review for NeurIPS paper: VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain

Neural Information Processing Systems

Weaknesses: My central concern with this paper is the misalignment between the motivation and the methodology. As motivation, the authors argue that self-supervised CV and NLP "algorithms are not effective for tabular data." The proposed model, though, is effectively the binary masked language model whose variants pervade self-supervised NLP research (e.g., BERT). Granted, instead of masking words, the proposed model masks tabular values, but this is a very similar pretext task. In fact, there is concurrent work that learns tabular representations using a BERT model [1].


VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain

Neural Information Processing Systems

Self- and semi-supervised learning frameworks have made significant progress in training machine learning models with limited labeled data in the image and language domains. These methods rely heavily on the unique structure of data in those domains (such as spatial relationships in images or semantic relationships in language), and they are not adaptable to general tabular data, which lacks such explicit structure. In this paper, we fill this gap by proposing novel self- and semi-supervised learning frameworks for tabular data, which we refer to collectively as VIME (Value Imputation and Mask Estimation). We create a novel pretext task of estimating mask vectors from corrupted tabular data, in addition to the reconstruction pretext task, for self-supervised learning. We also introduce a novel tabular data augmentation method for self- and semi-supervised learning frameworks. In experiments, we evaluate the proposed framework on multiple tabular datasets from various application domains, such as genomics and clinical data. VIME outperforms existing state-of-the-art baseline methods.
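As a rough illustration of the self-supervised part of this framework, the sketch below (PyTorch; module names such as mask_head and all dimensions are illustrative assumptions, not the authors' released code) corrupts a batch of tabular rows by replacing masked entries with draws from each feature's empirical marginal, then trains an encoder with the two pretext heads described above: mask estimation and value imputation.

import torch
import torch.nn as nn

def corrupt(x, p_mask=0.3):
    """Replace each entry, with probability p_mask, by a value drawn from
    that feature's empirical marginal (the same column of a row-shuffled
    copy of the batch)."""
    mask = (torch.rand_like(x) < p_mask).float()
    shuffled = torch.stack([x[torch.randperm(x.size(0)), j]
                            for j in range(x.size(1))], dim=1)
    return (1 - mask) * x + mask * shuffled, mask

d, h = 16, 64                          # feature / hidden dims (illustrative)
encoder = nn.Sequential(nn.Linear(d, h), nn.ReLU())
mask_head = nn.Linear(h, d)            # mask-estimation pretext head
recon_head = nn.Linear(h, d)           # value-imputation (reconstruction) head

x = torch.randn(128, d)                # stand-in for unlabeled tabular rows
x_tilde, mask = corrupt(x)
z = encoder(x_tilde)
loss = nn.functional.binary_cross_entropy_with_logits(mask_head(z), mask) \
       + nn.functional.mse_loss(recon_head(z), x)
loss.backward()

After pretext training, the encoder would be reused as the representation for a downstream predictor on the limited labeled data.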


Review for NeurIPS paper: VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain

Neural Information Processing Systems

This paper proposes a new reconstruction loss for unsupervised training of representations. This loss extends auto-encoders via a pretext task that uses the marginal distribution of features. The reviewers were unanimous in their decision to accept this paper.


Unsupervised Learning of Lagrangian Dynamics from Images for Prediction and Control (Supplementary Material)

Neural Information Processing Systems

Here we summarize all our model assumptions and highlight what is learned from data. Model assumptions: The choice of coordinates: we choose which set of coordinates we want to learn and design the coordinate-aware VAE accordingly. This is important from an interpretability perspective. Take the Acrobot as an example: the set of generalized coordinates that describes the time evolution of the system is not unique (see Figure 5 in [1] for another choice of generalized coordinates). Because of this non-uniqueness, the model would lose interpretability if we did not specify which set of coordinates to learn.



Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

Neural Information Processing Systems

Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning). However, despite the popularity and compatibility of these techniques, their efficacy in combination remains surprisingly unexplored. In this paper, we first undertake a systematic empirical investigation of this combination, finding (i) that in domain adaptation settings, self-training and contrastive learning offer significant complementary gains; and (ii) that in semi-supervised learning settings, surprisingly, the benefits are not synergistic. Across eight distribution shift datasets (e.g., BREEDs, WILDS), we demonstrate that the combined method obtains 3-8% higher accuracy than either approach independently. Finally, we theoretically analyze these techniques in a simplified model of distribution shift, demonstrating scenarios under which the features produced by contrastive learning can yield a good initialization for self-training to further amplify gains and achieve optimal performance, even when either method alone would fail.
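To make the combination concrete, here is a minimal sketch (PyTorch; the architecture, the function name self_train_round, and hyperparameters such as conf_thresh are illustrative assumptions, not the paper's recipe). The encoder is assumed to have already been pretrained with a contrastive objective on pooled unlabeled source and target data; each self-training round then fits the model on labeled source data plus confident pseudo-labels on unlabeled target data.

import torch
import torch.nn as nn

d, h, k = 32, 64, 10                   # input / hidden / class dims (illustrative)
encoder = nn.Sequential(nn.Linear(d, h), nn.ReLU())  # stand-in for a
                                                     # contrastively pretrained encoder
classifier = nn.Linear(h, k)
opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(classifier.parameters()), lr=1e-3)

def self_train_round(x_labeled, y_labeled, x_unlabeled, conf_thresh=0.9):
    """One self-training round: y_labeled is a LongTensor of class indices.
    Unlabeled points whose predicted class probability exceeds conf_thresh
    are pseudo-labeled and added to the training set."""
    with torch.no_grad():
        probs = classifier(encoder(x_unlabeled)).softmax(dim=-1)
        conf, pseudo = probs.max(dim=-1)
        keep = conf > conf_thresh      # keep only confident predictions
    x = torch.cat([x_labeled, x_unlabeled[keep]])
    y = torch.cat([y_labeled, pseudo[keep]])
    loss = nn.functional.cross_entropy(classifier(encoder(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

In the paper's analysis, the contrastively learned features play the role of the good initialization that lets rounds like this amplify gains under distribution shift.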


Reviews: Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs

Neural Information Processing Systems

The paper discusses how to solve semi-supervised learning with multilayer graphs. For single-layer graphs, this is achieved by label regression regularized by the graph Laplacian. For multilayer graphs, the paper argues for using a power mean Laplacian instead of a plain additive sum of the per-layer Laplacians. This generalizes prior work, including approaches based on the harmonic mean. Theoretical discussion follows under the assumptions of the Multilayer Stochastic Block Model (MSBM), showing that a trade-off between specificity and robustness can be achieved by adjusting the power parameter.


Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs

Neural Information Processing Systems

We study the task of semi-supervised learning on multilayer graphs by taking into account both labeled and unlabeled observations together with the information encoded by each individual graph layer. We propose a regularizer based on the generalized matrix mean, which is a one-parameter family of matrix means that includes the arithmetic, geometric and harmonic means as particular cases. We analyze it in expectation under a Multilayer Stochastic Block Model and verify numerically that it outperforms state-of-the-art methods. Moreover, we introduce a matrix-free numerical scheme based on contour integral quadratures and Krylov subspace solvers that scales to large sparse multilayer graphs.
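A dense, small-scale sketch of the regularizer follows (NumPy/SciPy). The diagonal shift is an implementation assumption made here to keep negative powers well defined, since graph Laplacians are singular, and the closed-form solve stands in for the paper's matrix-free contour-integral/Krylov scheme, which is what actually scales to large sparse graphs.

import numpy as np
from scipy.linalg import fractional_matrix_power

def power_mean_laplacian(laplacians, p, shift=1e-2):
    """Dense power mean of graph Laplacians:
    L_p = ((1/m) * sum_i (L_i + shift*I)^p)^(1/p).
    p=1 recovers the arithmetic mean and p=-1 the harmonic mean; the
    geometric mean is the p -> 0 limit and is not handled by this sketch."""
    n = laplacians[0].shape[0]
    acc = sum(fractional_matrix_power(L + shift * np.eye(n), p)
              for L in laplacians)
    return np.real(fractional_matrix_power(acc / len(laplacians), 1.0 / p))

def label_regression(L_p, labeled_idx, y_labeled, lam=1.0):
    """Laplacian-regularized label regression:
    minimize sum_{i labeled} (f_i - y_i)^2 + lam * f^T L_p f,
    whose first-order condition gives the linear system (S + lam*L_p) f = b."""
    n = L_p.shape[0]
    S = np.zeros((n, n))
    S[labeled_idx, labeled_idx] = 1.0  # indicator of labeled nodes
    b = np.zeros(n)
    b[labeled_idx] = y_labeled
    return np.linalg.solve(S + lam * L_p, b)

Sweeping the power parameter p in this sketch is what realizes the specificity-robustness trade-off analyzed under the MSBM.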


Reviews: Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs

Neural Information Processing Systems

This paper makes a contribution to the theory of semi-supervised learning for graph-based classification, as well as an efficient algorithm for computing the proposed classifier. This is an interesting problem, and the reviewers agree the contribution is at least incremental. I suggest the authors carefully revise the paper to address the reviewers' concerns for maximum impact.


Review for NeurIPS paper: Unsupervised Semantic Aggregation and Deformable Template Matching for Semi-Supervised Learning

Neural Information Processing Systems

1. It seems trivial to extend the Triplet Mutual Information [1] and its code [2]; the contribution of the proposed method is not clear.
2. Please explain the difference between your work and the Triplet Mutual Information of [1].
3. For the comparison, how were the parameters of the other methods tuned?
4. Deformable template matching is an existing technique. Please explain the differences between your work and [3, 4] separately.