Goto

Collaborating Authors

 Unsupervised or Indirectly Supervised Learning


Scalable Prompt Generation for Semi-supervised Learning with Language Models

arXiv.org Artificial Intelligence

Prompt-based learning methods in semi-supervised learning (SSL) settings have been shown to be effective on multiple natural language understanding (NLU) datasets and tasks in the literature. However, manually designing multiple prompts and verbalizers requires domain knowledge and human effort, making it difficult and expensive to scale across different datasets. In this paper, we propose two methods to automatically design multiple prompts and integrate automatic verbalizer in SSL settings without sacrificing performance. The first method uses various demonstration examples with learnable continuous prompt tokens to create diverse prompt models. The second method uses a varying number of soft prompt tokens to encourage language models to learn different prompts. For the verbalizer, we use the prototypical verbalizer to replace the manual one. In summary, we obtained the best average accuracy of 73.2% (a relative improvement of 2.52% over even the previous state-of-the-art SSL method with manual prompts and verbalizers) in different few-shot learning settings.


Are labels informative in semi-supervised learning? -- Estimating and leveraging the missing-data mechanism

arXiv.org Machine Learning

Semi-supervised learning is a powerful technique for leveraging unlabeled data to improve machine learning models, but it can be affected by the presence of ``informative'' labels, which occur when some classes are more likely to be labeled than others. In the missing data literature, such labels are called missing not at random. In this paper, we propose a novel approach to address this issue by estimating the missing-data mechanism and using inverse propensity weighting to debias any SSL algorithm, including those using data augmentation. We also propose a likelihood ratio test to assess whether or not labels are indeed informative. Finally, we demonstrate the performance of the proposed methods on different datasets, in particular on two medical datasets for which we design pseudo-realistic missing data scenarios.


AI/ML Algorithms and Applications in VLSI Design and Technology

arXiv.org Artificial Intelligence

An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations.


A Prototype-Oriented Clustering for Domain Shift with Source Privacy

arXiv.org Artificial Intelligence

Unsupervised clustering under domain shift (UCDS) studies how to transfer the knowledge from abundant unlabeled data from multiple source domains to learn the representation of the unlabeled data in a target domain. In this paper, we introduce Prototype-oriented Clustering with Distillation (PCD) to not only improve the performance and applicability of existing methods for UCDS, but also address the concerns on protecting the privacy of both the data and model of the source domains. PCD first constructs a source clustering model by aligning the distributions of prototypes and data. It then distills the knowledge to the target model through cluster labels provided by the source model while simultaneously clustering the target data. Finally, it refines the target model on the target domain data without guidance from the source model. Experiments across multiple benchmarks show the effectiveness and generalizability of our source-private clustering method.


QS-ADN: Quasi-Supervised Artifact Disentanglement Network for Low-Dose CT Image Denoising by Local Similarity Among Unpaired Data

arXiv.org Artificial Intelligence

Deep learning has been successfully applied to low-dose CT (LDCT) image denoising for reducing potential radiation risk. However, the widely reported supervised LDCT denoising networks require a training set of paired images, which is expensive to obtain and cannot be perfectly simulated. Unsupervised learning utilizes unpaired data and is highly desirable for LDCT denoising. As an example, an artifact disentanglement network (ADN) relies on unparied images and obviates the need for supervision but the results of artifact reduction are not as good as those through supervised learning.An important observation is that there is often hidden similarity among unpaired data that can be utilized. This paper introduces a new learning mode, called quasi-supervised learning, to empower the ADN for LDCT image denoising.For every LDCT image, the best matched image is first found from an unpaired normal-dose CT (NDCT) dataset. Then, the matched pairs and the corresponding matching degree as prior information are used to construct and train our ADN-type network for LDCT denoising.The proposed method is different from (but compatible with) supervised and semi-supervised learning modes and can be easily implemented by modifying existing networks. The experimental results show that the method is competitive with state-of-the-art methods in terms of noise suppression and contextual fidelity. The code and working dataset are publicly available at https://github.com/ruanyuhui/ADN-QSDL.git.


Optimal Transport Guided Unsupervised Learning for Enhancing low-quality Retinal Images

arXiv.org Machine Learning

Real-world non-mydriatic retinal fundus photography is prone to artifacts, imperfections and low-quality when certain ocular or systemic co-morbidities exist. Artifacts may result in inaccuracy or ambiguity in clinical diagnoses. In this paper, we proposed a simple but effective end-to-end framework for enhancing poor-quality retinal fundus images. Leveraging the optimal transport theory, we proposed an unpaired image-to-image translation scheme for transporting low-quality images to their high-quality counterparts. We theoretically proved that a Generative Adversarial Networks (GAN) model with a generator and discriminator is sufficient for this task. Furthermore, to mitigate the inconsistency of information between the low-quality images and their enhancements, an information consistency mechanism was proposed to maximally maintain structural consistency (optical discs, blood vessels, lesions) between the source and enhanced domains. Extensive experiments were conducted on the EyeQ dataset to demonstrate the superiority of our proposed method perceptually and quantitatively.


PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs

arXiv.org Artificial Intelligence

Recent advances in the understanding of Generative Adversarial Networks (GANs) have led to remarkable progress in visual editing and synthesis tasks, capitalizing on the rich semantics that are embedded in the latent spaces of pre-trained GANs. However, existing methods are often tailored to specific GAN architectures and are limited to either discovering global semantic directions that do not facilitate localized control, or require some form of supervision through manually provided regions or segmentation masks. In this light, we present an architecture-agnostic approach that jointly discovers factors representing spatial parts and their appearances in an entirely unsupervised fashion. These factors are obtained by applying a semi-nonnegative tensor factorization on the feature maps, which in turn enables context-aware local image editing with pixel-level control. In addition, we show that the discovered appearance factors correspond to saliency maps that localize concepts of interest, without using any labels. Experiments on a wide range of GAN architectures and datasets show that, in comparison to the state of the art, our method is far more efficient in terms of training time and, most importantly, provides much more accurate localized control. Our code is available at: https://github.com/james-oldfield/PandA.


Unsupervised Machine Learning. Unsupervised machine learning is a typeโ€ฆ

#artificialintelligence

Unsupervised machine learning is a type of machine learning where the model is trained on a dataset without any labeled output. The goal of unsupervised learning is to uncover hidden patterns or relationships in the data. Unsupervised learning is useful when labeled data is not available or when the goal is to discover new relationships in the data. However, it can be more challenging to evaluate the results of unsupervised learning compared to supervised learning, as there is no clear metric to assess the performance of the model. In conclusion, unsupervised learning is a powerful tool for understanding and extracting information from complex and unlabeled data.


Unsupervised Learning, Recommenders, Reinforcement Learning

#artificialintelligence

The Machine Learning Specialization is a foundational online program created in collaboration between DeepLearning.AI and Stanford Online. In this beginner-friendly program, you will learn the fundamentals of machine learning and how to use these techniques to build real-world AI applications. This Specialization is taught by Andrew Ng, an AI visionary who has led critical research at Stanford University and groundbreaking work at Google Brain, Baidu, and Landing.AI to advance the AI field. This 3-course Specialization is an updated and expanded version of Andrew's pioneering Machine Learning course, rated 4.9 out of 5 and taken by over 4.8 million learners since it launched in 2012. It provides a broad introduction to modern machine learning, including supervised learning (multiple linear regression, logistic regression, neural networks, and decision trees), unsupervised learning (clustering, dimensionality reduction, recommender systems), and some of the best practices used in Silicon Valley for artificial intelligence and machine learning innovation (evaluating and tuning models, taking a data-centric approach to improving performance, and more.)


Unsupervised Learning for Pilot-free Transmission in 3GPP MIMO Systems

arXiv.org Artificial Intelligence

Reference signals overhead reduction has recently evolved as an effective solution for improving the system spectral efficiency. This paper introduces a new downlink data structure that is free from demodulation reference signals (DM-RS), and hence does not require any channel estimation at the receiver. The new proposed data transmission structure involves a simple repetition step of part of the user data across the different sub-bands. Exploiting the repetition structure at the user side, it is shown that reliable recovery is possible via canonical correlation analysis. This paper also proposes two effective mechanisms for boosting the CCA performance in OFDM systems; one for repetition pattern selection and another to deal with the severe frequency selectivity issues. The proposed approach exhibits favorable complexity-performance tradeoff, rendering it appealing for practical implementation. Numerical results, using a 3GPP link-level testbench, demonstrate the superiority of the proposed approach relative to the state-of-the-art methods.