

Reviews: A Flexible Generative Framework for Graph-based Semi-supervised Learning

Neural Information Processing Systems

This work employs techniques developed in the network science literature, such as the latent space model (LSM) and the stochastic block model (SBM), to propose a generative model for features X, outputs Y, and graph G. It uses graph neural networks to approximate the posterior of the missing outputs given X, the observed Y, and G. The work is a sensible combination of recent methods that effectively addresses the problem of graph-based semi-supervised learning. However, I have some concerns, summarized as follows:
- Although the paper proposes an interesting new generative method for graph-based semi-supervised learning, it is not especially novel, since it uses existing methods (LSM, SBM, GCN, GAT) as its building blocks.
- The model appears to be generative only for G given X and Y. The remaining part is factorized as p(Y, X) = p(Y | X) p(X), and p(Y | X) is modeled with a multi-layer perceptron, which is a discriminative model. That is why the authors discard X in all the analyses, as any other discriminative model would, and say that everything is conditioned on X. I think this makes the proposed model not fully generative: it is generative for G but not for X and Y.
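The factorization this review describes can be written out explicitly. The notation below is reconstructed from the reviewer's wording and is not copied from the paper; it only makes visible which factor is modeled generatively and which is not.

```latex
\[
p(X, Y, G) = p(G \mid X, Y)\, p(Y \mid X)\, p(X)
\]
% Treating X as given (never modeling p(X)) leaves
\[
p(Y, G \mid X) = p(G \mid X, Y)\, p(Y \mid X)
\]
% p(G | X, Y): the graph model (LSM- or SBM-style), the generative part.
% p(Y | X):    a multi-layer perceptron, i.e. a discriminative model.
```

Read this way, the reviewer's point is that only the first factor on the right-hand side is generative.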


Reviews: A Flexible Generative Framework for Graph-based Semi-supervised Learning

Neural Information Processing Systems

This paper proposes a generative framework for graph-based semi-supervised learning that approximates the joint distribution of the graph structure, the labels, and the node features. Variational inference techniques are then used to approximate the Bayesian posterior. The paper is well written. Reviewer 3 raised some issues regarding a better positioning of GenGNN with respect to GCN/GAT, which the authors are recommended to take into account in the final version of the paper.


Review for NeurIPS paper: Learning Sparse Prototypes for Text Generation

Neural Information Processing Systems

Additional Feedback: I enjoyed reading your manuscript, and I find the idea of generating sentences by editing prototypes an exciting direction. Below I raise a few points; some are comments and some are clarification questions. 1. If the support of the retrieval component q(t | x) is an index set of the training data (that is, a sample t indexes a training instance), you cannot easily change the support at test time, can you? If I am right that you cannot, then you are limited to using the same repository of prototypes during training and test. This in turn means that subsampling the training data for scalability also affects the set of prototypes available for test-time generation.


Review for NeurIPS paper: Learning Sparse Prototypes for Text Generation

Neural Information Processing Systems

This paper builds upon the prototype-driven text generation approach of Guu et al. (2018). Two major changes are made: first, modeling a sparse distribution over prototypes with a Dirichlet prior over a multinomial, and second, actually learning this sparse distribution. At training time, the paper uses amortized variational inference, further approximating the gradients using REINFORCE to deal with the large number of prototypes. At inference time, the model can keep fewer training examples in memory by retaining only those whose posterior probability is larger than a threshold. Thus both the memory required to store training examples and the time spent on retrieving them are reduced.
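The inference-time filtering step mentioned in this review can be illustrated with a minimal sketch. The function name, the dictionary layout, and the probability values are all hypothetical; the review only states that examples whose posterior probability exceeds a threshold are kept.

```python
from typing import Dict, List


def filter_prototypes(posterior: Dict[int, float], threshold: float) -> List[int]:
    """Return the indices of training examples whose posterior mass exceeds `threshold`."""
    return sorted(i for i, p in posterior.items() if p > threshold)


# Hypothetical posterior over five training examples (values are made up).
posterior = {0: 0.55, 1: 0.001, 2: 0.30, 3: 0.002, 4: 0.147}

# Only the retained indices need to stay in memory for retrieval.
kept = filter_prototypes(posterior, threshold=0.01)
```

With a sparse posterior, most entries fall below the threshold, which is where the memory and retrieval-time savings would come from.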


Review for NeurIPS paper: Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning

Neural Information Processing Systems

This paper proposes an approach to semi-supervised learning for imbalanced classes. It is indeed non-trivial to combine local/global/perturbation consistency-based semi-supervised methods with fully supervised methods for imbalanced classes, and this paper may be the first work in this direction. The approach is quite general and can be applied on top of any pseudo-labeling-based semi-supervised method. It first estimates the true class-prior probability and then refines the pseudo-labels by pushing their class distribution toward that estimate via a constrained convex optimization. While the reviewers initially had some concerns (mainly about clarity and the small number of datasets), the authors did a particularly good job in their rebuttal, showing that the class-prior probability can be estimated rather than having to be given.
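To make the idea of aligning pseudo-labels with a class prior concrete, here is a small sketch. It is not the paper's optimization: it swaps in plain iterative proportional fitting (alternately rescaling columns to match the prior and renormalizing rows), and it assumes strictly positive soft pseudo-labels. The function name and the toy numbers are hypothetical.

```python
from typing import List


def align_pseudo_labels(probs: List[List[float]], prior: List[float],
                        iters: int = 200) -> List[List[float]]:
    """Rescale soft pseudo-labels so that their average matches `prior`.

    Assumes every entry of `probs` is strictly positive and `prior` sums to 1.
    """
    n, k = len(probs), len(prior)
    refined = [row[:] for row in probs]
    for _ in range(iters):
        # Step 1: scale each class column so its total mass equals n * prior[c].
        for c in range(k):
            col_sum = sum(row[c] for row in refined)
            scale = n * prior[c] / col_sum
            for row in refined:
                row[c] *= scale
        # Step 2: renormalize each row so it is a probability distribution again.
        for row in refined:
            s = sum(row)
            for c in range(k):
                row[c] /= s
    return refined


# Toy pseudo-labels that are biased toward class 0; the prior says classes are balanced.
biased = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
refined = align_pseudo_labels(biased, prior=[0.5, 0.5])
```

The rescaling keeps the relative ordering of examples within a class, so the most confident class-0 example stays the most confident after alignment.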


Reviews: Consistency-based Semi-supervised Learning for Object detection

Neural Information Processing Systems

The paper presents a semi-supervised approach for object detection that extends the consistency regularization used for image classification [14] to object detection. Concretely, it proposes consistency losses for both classification and localization, as well as a background elimination technique that alleviates the class imbalance inherent to object detection. The authors evaluate their approach with two types of detectors (single-stage and two-stage) on PASCAL VOC 2007, with unlabeled data from VOC 2012 and COCO. Pros: The approach is novel; as far as I know, no previous work addresses semi-supervised learning with consistency regularization for object detection. The use of JS divergence over L2 distance is justified and supported experimentally.
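For readers unfamiliar with the JS divergence mentioned in this review, the following sketch computes it between two predicted class distributions. How the paper pairs predictions and weights the loss is not stated in the review, so this shows only the divergence itself; the function names are illustrative.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in nats; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence: symmetric, and bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Two hypothetical class predictions for the same box under different perturbations.
pred_a = [0.7, 0.2, 0.1]
pred_b = [0.6, 0.3, 0.1]
consistency_loss = js_divergence(pred_a, pred_b)
```

Unlike squared L2 distance on probabilities, this quantity is an information-theoretic measure that is symmetric in its two arguments and stays bounded even when the predictions disagree completely.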


Reviews: Consistency-based Semi-supervised Learning for Object detection

Neural Information Processing Systems

This paper introduces a semi-supervised approach for object detection that extends the consistency regularization used for image classification to object detection. The proposed approach is novel and interesting. The evaluation could be improved to make the comparison more convincing, as suggested by several reviewers.



CEReBrO: Compact Encoder for Representations of Brain Oscillations Using Efficient Alternating Attention

arXiv.org Artificial Intelligence

Electroencephalography (EEG) is a crucial tool for studying brain activity. Recently, self-supervised learning methods leveraging large unlabeled datasets have emerged as a potential solution to the scarcity of widely available annotated EEG data. However, current methods suffer from at least one of the following limitations: i) sub-optimal EEG signal modeling, ii) model sizes in the hundreds of millions of trainable parameters, and iii) reliance on private datasets and/or inconsistent public benchmarks, hindering reproducibility. To address these challenges, we introduce a Compact Encoder for Representations of Brain Oscillations using alternating attention (CEReBrO), a new small EEG foundation model. Our tokenization scheme represents EEG signals at a per-channel patch granularity. We propose an alternating attention mechanism that jointly models intra-channel temporal dynamics and inter-channel spatial correlations, achieving a 2x speed improvement while requiring 6x less memory than standard self-attention. We present several model sizes ranging from 3.6 million to 85 million parameters. Pre-trained on over 20,000 hours of publicly available scalp EEG recordings with diverse channel configurations, our models set new benchmarks in emotion detection and seizure detection tasks, with competitive performance in anomaly classification and gait prediction. This validates our models' effectiveness and efficiency.
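One way to see why restricting attention to within-channel and across-channel groups can be cheaper than full self-attention is to count attention-score entries per layer. The sketch below assumes one plausible reading of the abstract (a temporal layer attends across patches within each channel, a spatial layer attends across channels within each patch position); it is not the authors' implementation and is not meant to reproduce the reported 2x and 6x figures. All function names are hypothetical.

```python
from typing import Tuple


def full_attention_scores(channels: int, patches: int) -> int:
    """Score-matrix entries when every per-channel patch token attends to every other token."""
    tokens = channels * patches
    return tokens * tokens


def alternating_attention_scores(channels: int, patches: int) -> Tuple[int, int]:
    """Score-matrix entries for a temporal layer and for a spatial layer (assumed layout)."""
    temporal = channels * patches * patches   # per channel: patches x patches
    spatial = patches * channels * channels   # per patch position: channels x channels
    return temporal, spatial


# Toy configuration: 4 channels, 8 patches per channel.
full = full_attention_scores(4, 8)
temporal, spatial = alternating_attention_scores(4, 8)
```

Under these assumptions, each alternating layer builds a score matrix that is smaller than the full one by a factor equal to the number of channels (temporal layer) or the number of patches (spatial layer).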


Review for NeurIPS paper: Exemplar Guided Active Learning

Neural Information Processing Systems

The paper proposes selecting unlabelled training examples based on the embedding distance between a given exemplar and the query data. A pretrained BERT model is used to compute the embeddings of the training examples. The problem formulation of selecting balanced labels in a highly skewed training set and the complexity bound are appreciated by all the reviewers. The general consensus is that the paper adds an interesting contribution to active learning methods applied to word sense disambiguation. The current version of the paper would be greatly strengthened by including more datasets.
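The selection step summarized above can be sketched as a nearest-neighbor search around an exemplar. In the paper the embeddings come from a pretrained BERT model; here they are toy two-dimensional vectors, and the choice of Euclidean distance and the function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_nearest(exemplar: Sequence[float],
                   candidates: List[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k unlabelled candidates closest to the exemplar."""
    ranked = sorted(range(len(candidates)),
                    key=lambda i: euclidean(exemplar, candidates[i]))
    return ranked[:k]


# Toy embeddings standing in for model outputs (values are made up).
exemplar = [0.0, 0.0]
unlabelled = [[3.0, 4.0], [0.0, 1.0], [1.0, 1.0], [10.0, 0.0]]
to_label = select_nearest(exemplar, unlabelled, k=2)
```

Searching around an exemplar of a rare class is what lets the labeling budget go to candidates likely to belong to that class, rather than to the dominant class of a skewed dataset.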