AITopics | Unsupervised or Indirectly Supervised Learning


Unsupervised or Indirectly Supervised Learning

Unsupervised learning is a branch of machine learning that learns from test data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning identifies commonalities in the data and reacts based on the presence or absence of such commonalities in each new piece of data. (Wikipedia)


Quantum Generative Adversarial Networks: Bridging Classical and Quantum Realms

Nokhwal, Sahil, Nokhwal, Suman, Pahune, Saurabh, Chaudhary, Ankit

arXiv.org Artificial Intelligence, Dec-26-2023

In this pioneering research paper, we present a groundbreaking exploration into the synergistic fusion of classical and quantum computing paradigms within the realm of Generative Adversarial Networks (GANs). Our objective is to seamlessly integrate quantum computational elements into the conventional GAN architecture, thereby unlocking novel pathways for enhanced training processes. Drawing inspiration from the inherent capabilities of quantum bits (qubits), we delve into the incorporation of quantum data representation methodologies within the GAN framework. By capitalizing on the unique quantum features, we aim to accelerate the training process of GANs, offering a fresh perspective on the optimization of generative models. Our investigation deals with theoretical considerations and evaluates the potential quantum advantages that may manifest in terms of training efficiency and generative quality. We confront the challenges inherent in the quantum-classical amalgamation, addressing issues related to quantum hardware constraints, error correction mechanisms, and scalability considerations. This research is positioned at the forefront of quantum-enhanced machine learning, presenting a critical stride towards harnessing the computational power of quantum systems to expedite the training of Generative Adversarial Networks. Through our comprehensive examination of the interface between classical and quantum realms, we aim to uncover transformative insights that will propel the field forward, fostering innovation and advancing the frontier of quantum machine learning.

generative adversarial network, generative model, quantum generative adversarial network, (12 more...)

arXiv: 2312.09939

Country:

North America > United States > Tennessee > Shelby County > Memphis (0.04)
North America > United States > Ohio > Franklin County > Dublin (0.04)
North America > United States > California > Alameda County > Pleasanton (0.04)
(2 more...)

Genre:

Overview (0.69)
Research Report > Promising Solution (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.98)


Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding

Kong, Yibo, Tiley, George P., Solis-Lemus, Claudia

arXiv.org Machine Learning, Dec-26-2023

The Tree of Life is a massive graphical structure that represents the evolutionary process from single-cell organisms to the immense biodiversity of species living today. Estimating the Tree of Life would not only represent the greatest accomplishment in evolutionary biology and systematics, but would also allow us to fully understand the development and evolution of important biological traits in nature, in particular those related to resilience to extinction under environmental threats such as climate change. Therefore, the development of statistical and machine-learning theory to reconstruct the Tree of Life, especially methods scalable to big data, is paramount in evolutionary biology, systematics, and conservation efforts against mass extinctions. Graphical structures that represent evolutionary processes are called phylogenetic trees. A phylogenetic tree is a binary tree whose internal nodes represent ancestral species that over time differentiate into two separate species, giving rise to two child nodes (see Figure 1 left). The evolutionary process is then depicted by this bifurcating tree from the root (the origin of life) to the external nodes of the tree (also called leaves), which represent the organisms living today.

artificial intelligence, machine learning, phylogenetic tree, (18 more...)

arXiv: 2312.16074

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Africa > Madagascar (0.05)
Oceania > Australia (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.66)


Benefit from public unlabeled data: A Frangi filtering-based pretraining network for 3D cerebrovascular segmentation

Shi, Gen, Lu, Hao, Hui, Hui, Tian, Jie

arXiv.org Artificial Intelligence, Dec-23-2023

Precise cerebrovascular segmentation in time-of-flight magnetic resonance angiography (TOF-MRA) data is crucial for clinical computer-aided diagnosis. However, the sparse distribution of cerebrovascular structures in TOF-MRA results in an exceedingly high cost for manual data labeling. The use of unlabeled TOF-MRA data holds the potential to enhance model performance significantly. In this study, we construct the largest preprocessed unlabeled TOF-MRA dataset (1510 subjects) to date. We also provide three additional labeled datasets totaling 113 subjects. Furthermore, we propose a simple yet effective pretraining strategy based on Frangi filtering, known for enhancing vessel-like structures, to fully leverage the unlabeled data for 3D cerebrovascular segmentation. Specifically, we develop a Frangi filtering-based preprocessing workflow to handle the large-scale unlabeled dataset, and a multi-task pretraining strategy is proposed to effectively utilize the preprocessed data. By employing this approach, we maximize the knowledge gained from the unlabeled data. The pretrained model is evaluated on four cerebrovascular segmentation datasets. The results demonstrate the superior performance of our model, with an improvement of approximately 3% compared to state-of-the-art semi- and self-supervised methods. Furthermore, the ablation studies also demonstrate the generalizability and effectiveness of the pretraining method regarding the backbone structures. The code and data are open source at: https://github.com/shigen-StoneRoot/FFPN.

cerebrovascular segmentation, dataset, segmentation, (15 more...)

arXiv: 2312.15273

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)


Detecting fake accounts through Generative Adversarial Network in online social media

Bordbar, Jinus, Mohammadrezaie, Mohammadreza, Ardalan, Saman, Shiri, Mohammad Ebrahim

arXiv.org Artificial Intelligence, Dec-20-2023

Online social media is integral to human life, facilitating messaging, information sharing, and confidential communication while preserving privacy. Platforms like Twitter, Instagram, and Facebook exemplify this phenomenon. However, users face challenges due to network anomalies, often stemming from malicious activities such as identity theft for financial gain or harm. This paper proposes a novel method using user similarity measures and the Generative Adversarial Network (GAN) algorithm to identify fake user accounts in the Twitter dataset. Despite the problem's complexity, the method achieves an AUC of 80% in classifying and detecting fake accounts. Notably, the study builds on previous research, highlighting advancements and insights into the evolving landscape of anomaly detection in online social networks.
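For readers unfamiliar with the AUC figure quoted in this abstract, the sketch below computes it as the fraction of (fake, genuine) account pairs that a detector ranks in the right order. The scores and labels are made-up illustrative values, not data from the paper, and the function name is an assumption.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical detector scores; label 1 marks a fake account, 0 a genuine one.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
value = auc(scores, labels)  # 5 of the 6 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so a value near 0.8 means roughly four out of five such pairs are ordered correctly.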

detecting fake account, generative adversarial network, online social media

arXiv: 2210.15657

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (0.89)
Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)


Meta Co-Training: Two Views are Better than One

Rothenberger, Jay C., Diochnos, Dimitrios I.

arXiv.org Artificial Intelligence, Dec-20-2023

In many practical computer vision scenarios unlabeled data is plentiful, but labels are scarce and difficult to obtain. As a result, semi-supervised learning, which leverages unlabeled data to boost the performance of supervised classifiers, has received significant attention in recent literature. One major class of semi-supervised algorithms is co-training. In co-training, two different models leverage different independent and sufficient "views" of the data to jointly make better predictions. During co-training, each model creates pseudo labels on unlabeled points, which are used to improve the other model. We show that in the common case when independent views are not available, we can construct such views inexpensively using pre-trained models. Co-training on the constructed views yields a performance improvement over any of the individual views we construct and performance comparable with recent approaches in semi-supervised learning, but has some undesirable properties. To alleviate the issues present with co-training, we present Meta Co-Training, which is an extension of the successful Meta Pseudo Labels approach to two views. Our method achieves new state-of-the-art performance on ImageNet-10% with very few training resources, and outperforms prior semi-supervised work on several other fine-grained image classification datasets.
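The pseudo-label exchange at the heart of classic co-training can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: each sample has two scalar "views", each model is a nearest-centroid classifier, and a margin threshold stands in for confidence. All names, the threshold, and the data are hypothetical, and this is plain co-training, not the Meta Co-Training method of the paper.

```python
def fit_centroids(xs, ys):
    """Nearest-centroid 'model': mean feature value per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


def predict(centroids, x):
    """Return (label, margin); margin is the gap between the two closest centroids."""
    dists = sorted((abs(x - m), c) for c, m in centroids.items())
    return dists[0][1], dists[1][0] - dists[0][0]


def co_training_round(labeled, unlabeled, min_margin=1.0):
    """One round: each view pseudo-labels confident points for the OTHER view's model."""
    train = {0: list(labeled), 1: list(labeled)}  # per-view training sets
    for view in (0, 1):
        model = fit_centroids([f[view] for f, _ in labeled], [y for _, y in labeled])
        for feats in unlabeled:
            label, margin = predict(model, feats[view])
            if margin >= min_margin:  # assumed confidence rule
                train[1 - view].append((feats, label))
    return train


# Hypothetical data: each sample is (view_0_feature, view_1_feature).
labeled = [((0.0, 0.1), 0), ((5.0, 4.9), 1)]
unlabeled = [(0.2, 2.4), (4.8, 5.1), (2.5, 0.3)]
train = co_training_round(labeled, unlabeled)
```

In this toy run the point `(2.5, 0.3)` is ambiguous in view 0 but clear in view 1, so only the view-1 model labels it, and that pseudo label is handed to the view-0 training set. This is the sense in which two views can help each other.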

accuracy, dataset, learning, (14 more...)

arXiv: 2311.18083

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(24 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Fake detection in imbalance dataset by Semi-supervised learning with GAN

Bordbar, Jinus, Ardalan, Saman, Mohammadrezaie, Mohammadreza, Ghasemi, Zahra

arXiv.org Artificial Intelligence, Dec-20-2023

As social media continues to grow rapidly, the prevalence of harassment on these platforms has also increased. This has piqued the interest of researchers in the field of fake detection. Social media data often forms complex graphs with numerous nodes, posing several challenges. These challenges include dealing with a significant number of irrelevant features in matrices and addressing issues such as high data dispersion and an imbalanced class distribution within the dataset. To overcome these challenges, researchers have employed auto-encoders and a combination of semi-supervised learning with a GAN algorithm, referred to as SGAN. Our proposed method utilizes auto-encoders for feature extraction and incorporates SGAN. By leveraging an unlabeled dataset, the unsupervised layer of SGAN compensates for the limited availability of labeled data, making efficient use of the limited number of labeled instances. Multiple evaluation metrics were employed, including the confusion matrix and the ROC curve. The dataset was divided into training and testing sets, with 100 labeled samples for training and 1,000 samples for testing. The novelty of our research lies in applying SGAN to address the issue of imbalanced datasets in fake account detection. By optimizing the use of a smaller number of labeled instances and reducing the need for extensive computational power, our method offers a more efficient solution. Additionally, our study contributes to the field by achieving 81% accuracy in detecting fake accounts using only 100 labeled samples. This demonstrates the potential of SGAN as a powerful tool for handling minority classes and addressing big data challenges in fake account detection.

fake detection, imbalance dataset, semi-supervised, (1 more...)

arXiv: 2212.01071

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.53)


Roll With the Punches: Expansion and Shrinkage of Soft Label Selection for Semi-supervised Fine-Grained Learning

Duan, Yue, Zhao, Zhen, Qi, Lei, Zhou, Luping, Wang, Lei, Shi, Yinghuan

arXiv.org Artificial Intelligence, Dec-19-2023

While semi-supervised learning (SSL) has yielded promising results, the more realistic SSL scenario remains to be explored, in which the unlabeled data exhibits extremely high recognition difficulty, e.g., fine-grained visual classification in the context of SSL (SS-FGVC). The increased recognition difficulty on fine-grained unlabeled data spells disaster for pseudo-labeling accuracy, resulting in poor performance of the SSL model. To tackle this challenge, we propose Soft Label Selection with Confidence-Aware Clustering based on Class Transition Tracking (SoC), which reconstructs the pseudo-label selection process by jointly optimizing an Expansion Objective and a Shrinkage Objective, both based on soft labels. The former objective encourages soft labels to absorb more candidate classes to ensure the presence of the ground-truth class, while the latter encourages soft labels to reject more noisy classes, which is theoretically proved to be equivalent to entropy minimization. In comparisons with various state-of-the-art methods, our approach demonstrates its superior performance in SS-FGVC. Checkpoints and source code are available at https://github.com/NJUyued/SoC4SS-FGVC.
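The link between rejecting noisy classes and lowering entropy can be seen on a toy soft label. The sketch below uses a simple top-k truncation as a stand-in for shrinkage; this rule, the function names, and the example probabilities are assumptions for illustration and are not the SoC objectives themselves.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a soft label given as a list of probabilities."""
    return -sum(q * math.log(q) for q in p if q > 0)


def shrink_to_top_k(p, k):
    """Keep the k most probable classes, zero the rest, and renormalize."""
    keep = set(sorted(range(len(p)), key=lambda i: p[i], reverse=True)[:k])
    total = sum(p[i] for i in keep)
    return [p[i] / total if i in keep else 0.0 for i in range(len(p))]


# Hypothetical soft label over four fine-grained classes.
soft = [0.5, 0.3, 0.1, 0.1]
shrunk = shrink_to_top_k(soft, 2)  # probability mass now sits on two classes only
```

Here `entropy(shrunk)` is lower than `entropy(soft)`: dropping the two least likely classes concentrates the mass, which is the intuition behind treating shrinkage as a form of entropy minimization.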

computer vision, soft label, unlabeled data, (15 more...)

arXiv: 2312.12237

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Oceania > Australia > New South Wales > Wollongong (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.77)


Predicting Financial Literacy via Semi-supervised Learning

Rudd, David Hason, Huo, Huan, Xu, Guandong

arXiv.org Artificial Intelligence, Dec-18-2023

Financial literacy (FL) represents a person's ability to turn assets into income, and understanding digital currencies has been added to the modern definition. FL can be predicted by exploiting unlabelled recorded data in financial networks via semi-supervised learning (SSL). Measuring and predicting FL has not been widely studied, resulting in limited understanding of customer financial engagement consequences. Previous studies have shown that low FL increases the risk of social harm. Therefore, it is important to accurately estimate FL to allocate specific intervention programs to less financially literate groups. This will not only increase company profitability, but will also reduce government spending. Some studies considered predicting FL in classification tasks, whereas others developed FL definitions and impacts. The current paper investigates mechanisms to learn customer FL level from their financial data using the synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN). We propose the SMOGN-COREG model for semi-supervised regression, applying SMOGN to deal with unbalanced datasets and a nonparametric multi-learner co-regression (COREG) algorithm for labeling. We compared the SMOGN-COREG model with six well-known regressors on five datasets to evaluate the proposed model's effectiveness on unbalanced and unlabelled financial data. Experimental results confirmed that the proposed method outperformed the comparator models for unbalanced and unlabelled financial data. Therefore, SMOGN-COREG is a step towards using unlabelled data to estimate FL level.

algorithm, dataset, unlabelled data, (13 more...)

doi: 10.1007/978-3-030-97546-3_25

arXiv: 2312.10984

Country: Oceania > Australia (0.05)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.93)

Industry:

Banking & Finance (1.00)
Consumer Products & Services > Retirement (0.73)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)


Guided Distillation for Semi-Supervised Instance Segmentation

Berrada, Tariq, Couprie, Camille, Alahari, Karteek, Verbeek, Jakob

arXiv.org Artificial Intelligence, Dec-14-2023

Although instance segmentation methods have improved considerably, the dominant paradigm is to rely on fully-annotated training images, which are tedious to obtain. To alleviate this reliance, and boost results, semi-supervised approaches leverage unlabeled data as an additional training signal that limits overfitting to the labeled samples. In this context, we present novel design choices to significantly improve teacher-student distillation models. In particular, we (i) improve the distillation approach by introducing a novel "guided burn-in" stage, and (ii) evaluate different instance segmentation architectures, as well as backbone networks and pre-training strategies. Contrary to previous work which uses only supervised data for the burn-in period of the student model, we also use guidance of the teacher model to exploit unlabeled data in the burn-in period. Our improved distillation approach leads to substantial improvements over previous state-of-the-art results. For example, on the Cityscapes dataset we improve mask-AP from 23.7 to 33.9 when using labels for 10% of images, and on the COCO dataset we improve mask-AP from 18.3 to 34.1 when using labels for only 1% of the training data.

architecture, backbone, segmentation, (17 more...)

arXiv: 2308.02668

Country: Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Genre: Research Report (0.64)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Controller-Guided Partial Label Consistency Regularization with Unlabeled Data

Wang, Qian-Wei, Zhao, Bowen, Zhu, Mingyan, Li, Tianxiang, Liu, Zimo, Xia, Shu-Tao

arXiv.org Artificial Intelligence, Dec-14-2023

Partial label learning (PLL) learns from training examples each associated with multiple candidate labels, among which only one is valid. In recent years, benefiting from the strong capability of dealing with ambiguous supervision and the impetus of modern data augmentation methods, consistency regularization-based PLL methods have achieved a series of successes and become mainstream. However, as the partial annotation becomes insufficient, their performances drop significantly. In this paper, we leverage easily accessible unlabeled examples to facilitate the partial label consistency regularization. In addition to a partial supervised loss, our method performs a controller-guided consistency regularization at both the label-level and representation-level with the help of unlabeled data. To minimize the disadvantages of insufficient capabilities of the initial supervised model, we use the controller to estimate the confidence of each current prediction to guide the subsequent consistency regularization. Furthermore, we dynamically adjust the confidence thresholds so that the number of samples of each class participating in consistency regularization remains roughly equal to alleviate the problem of class-imbalance. Experiments show that our method achieves satisfactory performances in more practical situations, and its modules can be applied to existing PLL methods to enhance their capabilities.
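To make the partial-label setting concrete, the sketch below shows a generic baseline loss in which the model is only asked to place probability mass somewhere inside the candidate set. It is a common textbook-style formulation under assumed names and toy logits, not the controller-guided consistency regularization proposed in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def partial_label_loss(logits, candidates):
    """Negative log of the total probability assigned to the candidate labels."""
    probs = softmax(logits)
    return -math.log(sum(probs[c] for c in candidates))


def candidate_posterior(logits, candidates):
    """Renormalize predictions over the candidates; non-candidates get zero."""
    probs = softmax(logits)
    mass = sum(probs[c] for c in candidates)
    return [probs[i] / mass if i in candidates else 0.0 for i in range(len(probs))]


# Hypothetical example: four classes, of which classes 0 and 2 are candidates.
logits = [2.0, 1.0, 0.5, -1.0]
loss = partial_label_loss(logits, {0, 2})
posterior = candidate_posterior(logits, {0, 2})
```

The loss is zero only when all probability mass falls inside the candidate set, and the renormalized posterior is one simple way to guess which candidate is the valid label.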

consistency regularization, learning, representation, (11 more...)

arXiv: 2210.11194

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
