AITopics | Unsupervised or Indirectly Supervised Learning

Collaborating Authors

Unsupervised or Indirectly Supervised Learning

Unsupervised learning is a branch of machine learning that learns from test data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning identifies commonalities in the data and reacts based on the presence or absence of such commonalities in each new piece of data. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Supervision by Denoising for Medical Image Segmentation

Young, Sean I., Dalca, Adrian V., Ferrante, Enzo, Golland, Polina, Metzler, Christopher A., Fischl, Bruce, Iglesias, Juan Eugenio

arXiv.org Artificial IntelligenceJan-4-2024

Abstract--Learning-based image reconstruction models, such as those based on the U-Net, require a large set of labeled images if good generalization is to be guaranteed. In some imaging domains, however, labeled data with pixel-or voxel-level label accuracy are scarce due to the cost of acquiring them. This problem is exacerbated further in domains like medical imaging, where there is no single ground truth label, resulting in large amounts of repeat variability in the labels. Therefore, training reconstruction networks to generalize better by learning from both labeled and unlabeled examples (called semi-supervised learning) is problem of practical and theoretical interest. However, traditional semi-supervised learning methods for image reconstruction often necessitate handcrafting a differentiable regularizer specific to some given imaging problem, which can be extremely time-consuming. In this work, we propose "supervision by denoising" (SUD), a framework to supervise reconstruction models using their own denoised output as labels. SUD unifies stochastic averaging and spatial denoising techniques under a spatio-temporal denoising framework and alternates denoising and model weight update steps in an optimization framework for semi-supervision. As example applications, we apply SUD to two problems from biomedical imaging--anatomical brain reconstruction (3D) and cortical parcellation (2D)--to demonstrate a significant improvement in reconstruction over supervised-only and ensembling baselines. While reconstruction models such as those based on the reconstruction network has proved extremely useful for U-Net [5] typically outperform handcrafted models in many imposing topological or spatial priors on the reconstruction imaging problems, they can involve millions of parameters [18], [19] and semi-supervised learning (SSL). SSL methods and, as a result, have a tendency to overfit training data and based on regularization suffer neither from limited diversity generalize poorly to previously unseen images at test time-- of augmented data nor domain gaps resulting from training a problem also exacerbated by distribution shift [6].

denoiser, reconstruction, segmentation, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TPAMI.2023.3299789

2202.02952

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(10 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Distance Guided Generative Adversarial Network for Explainable Binary Classifications

Xiong, Xiangyu, Sun, Yue, Liu, Xiaohong, Ke, Wei, Lam, Chan-Tong, Chen, Jiangang, Jiang, Mingfeng, Wang, Mingwei, Xie, Hui, Tong, Tong, Gao, Qinquan, Chen, Hao, Tan, Tao

arXiv.org Artificial IntelligenceDec-29-2023

Despite the potential benefits of data augmentation for mitigating the data insufficiency, traditional augmentation methods primarily rely on the prior intra-domain knowledge. On the other hand, advanced generative adversarial networks (GANs) generate inter-domain samples with limited variety. These previous methods make limited contributions to describing the decision boundaries for binary classification. In this paper, we propose a distance guided GAN (DisGAN) which controls the variation degrees of generated samples in the hyperplane space. Specifically, we instantiate the idea of DisGAN by combining two ways. The first way is vertical distance GAN (VerDisGAN) where the inter-domain generation is conditioned on the vertical distances. The second way is horizontal distance GAN (HorDisGAN) where the intra-domain generation is conditioned on the horizontal distances. Furthermore, VerDisGAN can produce the class-specific regions by mapping the source images to the hyperplane. Experimental results show that DisGAN consistently outperforms the GAN-based augmentation methods with explainable binary classification. The proposed method can apply to different classification architectures and has potential to extend to multi-class classification.

classification, classifier, hyperplane, (15 more...)

arXiv.org Artificial Intelligence

2312.17538

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Shanghai > Shanghai (0.05)
Asia > Macao (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area (0.96)
Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Robust Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models

Tian, Ye, Weng, Haolei, Feng, Yang

arXiv.org Machine LearningDec-28-2023

Unsupervised learning has been widely used in many real-world applications. One of the simplest and most important unsupervised learning models is the Gaussian mixture model (GMM). In this work, we study the multi-task learning problem on GMMs, which aims to leverage potentially similar GMM parameter structures among tasks to obtain improved learning performance compared to single-task learning. We propose a multi-task GMM learning procedure based on the EM algorithm that not only can effectively utilize unknown similarity between related tasks but is also robust against a fraction of outlier tasks from arbitrary distributions. The proposed procedure is shown to achieve minimax optimal rate of convergence for both parameter estimation error and the excess mis-clustering error, in a wide range of regimes. Moreover, we generalize our approach to tackle the problem of transfer learning for GMMs, where similar theoretical results are derived. Finally, we demonstrate the effectiveness of our methods through simulations and real data examples. To the best of our knowledge, this is the first work studying multi-task and transfer learning on GMMs with theoretical guarantees.

algorithm, estimation error, outlier task, (14 more...)

arXiv.org Machine Learning

2209.15224

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
North America > United States > Michigan (0.04)
(3 more...)

Genre: Research Report > New Finding (0.45)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Add feedback

PG-LBO: Enhancing High-Dimensional Bayesian Optimization with Pseudo-Label and Gaussian Process Guidance

Chen, Taicai, Duan, Yue, Li, Dong, Qi, Lei, Shi, Yinghuan, Gao, Yang

arXiv.org Artificial IntelligenceDec-28-2023

Variational Autoencoder based Bayesian Optimization (VAE-BO) has demonstrated its excellent performance in addressing high-dimensional structured optimization problems. However, current mainstream methods overlook the potential of utilizing a pool of unlabeled data to construct the latent space, while only concentrating on designing sophisticated models to leverage the labeled data. Despite their effective usage of labeled data, these methods often require extra network structures, additional procedure, resulting in computational inefficiency. To address this issue, we propose a novel method to effectively utilize unlabeled data with the guidance of labeled data. Specifically, we tailor the pseudo-labeling technique from semi-supervised learning to explicitly reveal the relative magnitudes of optimization objective values hidden within the unlabeled data. Based on this technique, we assign appropriate training weights to unlabeled data to enhance the construction of a discriminative latent space. Furthermore, we treat the VAE encoder and the Gaussian Process (GP) in Bayesian optimization as a unified deep kernel learning process, allowing the direct utilization of labeled data, which we term as Gaussian Process guidance. This directly and effectively integrates the goal of improving GP accuracy into the VAE training, thereby guiding the construction of the latent space. The extensive experiments demonstrate that our proposed method outperforms existing VAE-BO algorithms in various optimization scenarios. Our code will be published at https://github.com/TaicaiChen/PG-LBO.

latent space, pg-lbo, unlabeled data, (13 more...)

arXiv.org Artificial Intelligence

2312.16983

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Add feedback

FlexSSL : A Generic and Efficient Framework for Semi-Supervised Learning

Qin, Huiling, Zhan, Xianyuan, Li, Yuanxun, Zheng, Yu

arXiv.org Artificial IntelligenceDec-28-2023

Semi-supervised learning holds great promise for many real-world applications, due to its ability to leverage both unlabeled and expensive labeled data. However, most semi-supervised learning algorithms still heavily rely on the limited labeled data to infer and utilize the hidden information from unlabeled data. We note that any semi-supervised learning task under the self-training paradigm also hides an auxiliary task of discriminating label observability. Jointly solving these two tasks allows full utilization of information from both labeled and unlabeled data, thus alleviating the problem of over-reliance on labeled data. This naturally leads to a new generic and efficient learning framework without the reliance on any domain-specific information, which we call FlexSSL. The key idea of FlexSSL is to construct a semi-cooperative "game", which forges cooperation between a main self-interested semi-supervised learning task and a companion task that infers label observability to facilitate main task training. We show with theoretical derivation of its connection to loss re-weighting on noisy labels. Through evaluations on a diverse range of tasks, we demonstrate that FlexSSL can consistently enhance the performance of semi-supervised learning algorithms.

classification, flexssl, information, (15 more...)

arXiv.org Artificial Intelligence

2312.16892

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > France (0.04)
Asia > China > Beijing > Beijing (0.04)
(18 more...)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Soft Contrastive Learning for Time Series

Lee, Seunghan, Park, Taeyoung, Lee, Kibok

arXiv.org Machine LearningDec-27-2023

Contrastive learning has shown to be effective to learn representations from time series in a self-supervised way. However, contrasting similar time series instances or values from adjacent timestamps within a time series leads to ignore their inherent correlations, which results in deteriorating the quality of learned representations. To address this issue, we propose SoftCLT, a simple yet effective soft contrastive learning strategy for time series. This is achieved by introducing instance-wise and temporal contrastive loss with soft assignments ranging from zero to one. Specifically, we define soft assignments for 1) instance-wise contrastive loss by the distance between time series on the data space, and 2) temporal contrastive loss by the difference of timestamps. SoftCLT is a plug-and-play method for time series contrastive learning that improves the quality of learned representations without bells and whistles. In experiments, we demonstrate that SoftCLT consistently improves the performance in various downstream tasks including classification, semi-supervised learning, transfer learning, and anomaly detection, showing stateof-the-art performance. Code is available at this repository: https://github. Time series (TS) data are ubiquitous in many fields, including finance, energy, healthcare, and transportation (Ding et al., 2020; Lago et al., 2018; Solares et al., 2020; Cai et al., 2020). However, annotating TS data can be challenging as it often requires significant domain expertise and time. To overcome the limitation and utilize unlabeled data without annotations, self-supervised learning has emerged as a promising representation learning approach not only in natural language processing (Devlin et al., 2018; Gao et al., 2021) and computer vision (Chen et al., 2020; Dosovitskiy et al., 2021), but also in TS analysis (Franceschi et al., 2019; Yue et al., 2022). In particular, contrastive learning (CL) has demonstrated remarkable performance across different domains (Chen et al., 2020; Gao et al., 2021; Yue et al., 2022). As it is challenging to determine similarities of instances in self-supervised learning, recent CL works apply data augmentation to generate two views per data and take views from the same instance as positive pairs and the others as negatives (Chen et al., 2020).

contrastive loss, dataset, representation, (16 more...)

arXiv.org Machine Learning

2312.16424

Country:

Europe > Spain > Castile and León > Burgos Province > Burgos (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Joint empirical risk minimization for instance-dependent positive-unlabeled data

Rejchel, Wojciech, Teisseyre, Paweł, Mielniczuk, Jan

arXiv.org Machine LearningDec-27-2023

Learning from positive and unlabeled data (PU learning) is actively researched machine learning task. The goal is to train a binary classification model based on a training dataset containing part of positives which are labeled, and unlabeled instances. Unlabeled set includes remaining part of positives and all negative observations. An important element in PU learning is modeling of the labeling mechanism, i.e. labels' assignment to positive observations. Unlike in many prior works, we consider a realistic setting for which probability of label assignment, i.e. propensity score, is instance-dependent. In our approach we investigate minimizer of an empirical counterpart of a joint risk which depends on both posterior probability of inclusion in a positive class as well as on a propensity score. The non-convex empirical risk is alternately optimised with respect to parameters of both functions. In the theoretical analysis we establish risk consistency of the minimisers using recently derived methods from the theory of empirical processes. Besides, the important development here is a proposed novel implementation of an optimisation algorithm, for which sequential approximation of a set of positive observations among unlabeled ones is crucial. This relies on modified technique of 'spies' as well as on a thresholding rule based on conditional probabilities. Experiments conducted on 20 data sets for various labeling scenarios show that the proposed method works on par or more effectively than state-of-the-art methods based on propensity function estimation.

assumption, dataset, probability, (15 more...)

arXiv.org Machine Learning

2312.16557

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Poland > Kuyavian-Pomeranian Province > Toruń (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Twice Class Bias Correction for Imbalanced Semi-Supervised Learning

Li, Lan, Tao, Bowen, Han, Lu, Zhan, De-chuan, Ye, Han-jia

arXiv.org Artificial IntelligenceDec-27-2023

Differing from traditional semi-supervised learning, class-imbalanced semi-supervised learning presents two distinct challenges: (1) The imbalanced distribution of training samples leads to model bias towards certain classes, and (2) the distribution of unlabeled samples is unknown and potentially distinct from that of labeled samples, which further contributes to class bias in the pseudo-labels during training. To address these dual challenges, we introduce a novel approach called \textbf{T}wice \textbf{C}lass \textbf{B}ias \textbf{C}orrection (\textbf{TCBC}). We begin by utilizing an estimate of the class distribution from the participating training samples to correct the model, enabling it to learn the posterior probabilities of samples under a class-balanced prior. This correction serves to alleviate the inherent class bias of the model. Building upon this foundation, we further estimate the class bias of the current model parameters during the training process. We apply a secondary correction to the model's pseudo-labels for unlabeled samples, aiming to make the assignment of pseudo-labels across different classes of unlabeled samples as equitable as possible. Through extensive experimentation on CIFAR10/100-LT, STL10-LT, and the sizable long-tailed dataset SUN397, we provide conclusive evidence that our proposed TCBC method reliably enhances the performance of class-imbalanced semi-supervised learning.

class distribution, unlabeled data, unlabeled sample, (13 more...)

arXiv.org Artificial Intelligence

2312.16604

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Transfer and Alignment Network for Generalized Category Discovery

An, Wenbin, Tian, Feng, Shi, Wenkai, Chen, Yan, Wu, Yaqiang, Wang, Qianying, Chen, Ping

arXiv.org Artificial IntelligenceDec-27-2023

Generalized Category Discovery is a crucial real-world task. Despite the improved performance on known categories, current methods perform poorly on novel categories. We attribute the poor performance to two reasons: biased knowledge transfer between labeled and unlabeled data and noisy representation learning on the unlabeled data. To mitigate these two issues, we propose a Transfer and Alignment Network (TAN), which incorporates two knowledge transfer mechanisms to calibrate the biased knowledge and two feature alignment mechanisms to learn discriminative features. Specifically, we model different categories with prototypes and transfer the prototypes in labeled data to correct model bias towards known categories. On the one hand, we pull instances with known categories in unlabeled data closer to these prototypes to form more compact clusters and avoid boundary overlap between known and novel categories. On the other hand, we use these prototypes to calibrate noisy prototypes estimated from unlabeled data based on category similarities, which allows for more accurate estimation of prototypes for novel categories that can be used as reliable learning targets later. After knowledge transfer, we further propose two feature alignment mechanisms to acquire both instance- and category-level knowledge from unlabeled data by aligning instance features with both augmented features and the calibrated prototypes, which can boost model performance on both known and novel categories with less noise. Experiments on three benchmark datasets show that our model outperforms SOTA methods, especially on novel categories. Theoretical analysis is provided for an in-depth understanding of our model in general. Our code and data are available at https://github.com/Lackel/TAN.

category, novel category, prototype, (14 more...)

arXiv.org Artificial Intelligence

2312.16467

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Alameda County > Oakland (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)

Add feedback

Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning

Fan, Yan, Wang, Yu, Zhu, Pengfei, Hu, Qinghua

arXiv.org Artificial IntelligenceDec-26-2023

Continual learning (CL) has shown promising results and comparable performance to learning at once in a fully supervised manner. However, CL strategies typically require a large number of labeled samples, making their real-life deployment challenging. In this work, we focus on semi-supervised continual learning (SSCL), where the model progressively learns from partially labeled data with unknown categories. We provide a comprehensive analysis of SSCL and demonstrate that unreliable distributions of unlabeled data lead to unstable training and refinement of the progressing stages. This problem severely impacts the performance of SSCL. To address the limitations, we propose a novel approach called Dynamic Sub-Graph Distillation (DSGD) for semi-supervised continual learning, which leverages both semantic and structural information to achieve more stable knowledge distillation on unlabeled data and exhibit robustness against distribution bias. Firstly, we formalize a general model of structural distillation and design a dynamic graph construction for the continual learning progress. Next, we define a structure distillation vector and design a dynamic sub-graph distillation algorithm, which enables end-to-end training and adaptability to scale up tasks. The entire proposed method is adaptable to various CL methods and supervision settings. Finally, experiments conducted on three datasets CIFAR10, CIFAR100, and ImageNet-100, with varying supervision ratios, demonstrate the effectiveness of our proposed approach in mitigating the catastrophic forgetting problem in semi-supervised continual learning scenarios.

continual learning, learning, unlabeled data, (11 more...)

arXiv.org Artificial Intelligence

2312.16409

Country: Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback