AITopics

2212.1472

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > New South Wales (0.04)
North America > United States > New York > New York County > New York City (0.04)
(9 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Energy (1.00)
Education > Educational Setting > Online (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
(2 more...)

arXiv.org Artificial IntelligenceAug-3-2023

ProMix: Combating Label Noise via Maximizing Clean Sample Utility

Xiao, Ruixuan, Dong, Yiwen, Wang, Haobo, Feng, Lei, Wu, Runze, Chen, Gang, Zhao, Junbo

Learning with Noisy Labels (LNL) has become an appealing topic, as imperfectly annotated data are relatively cheaper to obtain. Recent state-of-the-art approaches employ specific selection mechanisms to separate clean and noisy samples and then apply Semi-Supervised Learning (SSL) techniques for improved performance. However, the selection step mostly provides a medium-sized and decent-enough clean subset, which overlooks a rich set of clean samples. To fulfill this, we propose a novel LNL framework ProMix that attempts to maximize the utility of clean samples for boosted performance. Key to our method, we propose a matched high confidence selection technique that selects those examples with high confidence scores and matched predictions with given labels to dynamically expand a base clean sample set. To overcome the potential side effect of excessive clean set selection procedure, we further devise a novel SSL framework that is able to train balanced and unbiased classifiers on the separated clean and noisy samples. Extensive experiments demonstrate that ProMix significantly advances the current state-of-the-art results on multiple benchmarks with different types and levels of noise. It achieves an average improvement of 2.48\% on the CIFAR-N dataset. The code is available at https://github.com/Justherozen/ProMix

artificial intelligence, machine learning, promix, (16 more...)

2207.10276

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Singapore (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.35)

Javanbakhat, Masoumeh, Lippert, Christoph

A Probabilistic Approach to Self-Supervised Learning using Cyclical Stochastic Gradient MCMC

In this paper we present a practical Bayesian self-supervised learning method with Cyclical Stochastic Gradient Hamiltonian Monte Carlo (cSGHMC). Within this framework, we place a prior over the parameters of a self-supervised learning model and use cSGHMC to approximate the high dimensional and multimodal posterior distribution over the embeddings. By exploring an expressive posterior over the embeddings, Bayesian self-supervised learning produces interpretable and diverse representations. Marginalizing over these representations yields a significant gain in performance, calibration and out-of-distribution detection on a variety of downstream classification tasks. We provide experimental results on multiple classification tasks on four challenging datasets. Moreover, we demonstrate the effectiveness of the proposed method in out-of-distribution detection using the SVHN and CIFAR-10 datasets.

learning, posterior distribution, representation, (15 more...)

2308.01271

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(2 more...)

Barrett, Noah, Sadeghi, Zahra, Matwin, Stan

Evolutionary Augmentation Policy Optimization for Self-supervised Learning

evolutionary augmentation policy optimization, self-supervised learning

Self-supervised Learning (SSL) is a machine learning algorithm for pretraining Deep Neural Networks (DNNs) without requiring manually labeled data. The central idea of this learning technique is based on an auxiliary stage aka pretext task in which labeled data are created automatically through data augmentation and exploited for pretraining the DNN. However, the effect of each pretext task is not well studied or compared in the literature. In this paper, we study the contribution of augmentation operators on the performance of self supervised learning algorithms in a constrained settings. We propose an evolutionary search method for optimization of data augmentation pipeline in pretext tasks and measure the impact of augmentation operators in several SOTA SSL algorithms. By encoding different combination of augmentation operators in chromosomes we seek the optimal augmentation policies through an evolutionary optimization mechanism. We further introduce methods for analyzing and explaining the performance of optimized SSL algorithms. Our results indicate that our proposed method can find solutions that outperform the accuracy of classification of SSL algorithms which confirms the influence of augmentation policy choice on the overall performance of SSL algorithms. We also compare optimal SSL solutions found by our evolutionary search mechanism and show the effect of batch size in the pretext task on two visual datasets.

doi: 10.54364/AAIML.2023.1167

2303.01584

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.80)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)

Data-Centric Diet: Effective Multi-center Dataset Pruning for Medical Image Segmentation

He, Yongkang, Chen, Mingjin, Yang, Zhijing, Lu, Yongyi

This paper seeks to address the dense labeling problems where a significant fraction of the dataset can be pruned without sacrificing much accuracy. We observe that, on standard medical image segmentation benchmarks, the loss gradient norm-based metrics of individual training examples applied in image classification fail to identify the important samples. To address this issue, we propose a data pruning method by taking into consideration the training dynamics on target regions using Dynamic Average Dice (DAD) score. To the best of our knowledge, we are among the first to address the data importance in dense labeling tasks in the field of medical image analysis, making the following contributions: (1) investigating the underlying causes with rigorous empirical analysis, and (2) determining effective data pruning approach in dense labeling problems. Our solution can be used as a strong yet simple baseline to select important examples for medical image segmentation with combined data sources.

artificial intelligence, inductive learning, machine learning, (15 more...)

2308.01189

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > China > Guangdong Province (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.34)

Improve Event Extraction via Self-Training with Gradient Guidance

Xu, Zhiyang, Lee, Jay-Yoon, Huang, Lifu

Data scarcity has been the main factor that hinders the progress of event extraction. To overcome this issue, we propose a Self-Training with Feedback (STF) framework that leverages the large-scale unlabeled data and acquires feedback for each new event prediction from the unlabeled data by comparing it to the Abstract Meaning Representation (AMR) graph of the same sentence. Specifically, STF consists of (1) a base event extraction model trained on existing event annotations and then applied to large-scale unlabeled corpora to predict new event mentions as pseudo training samples, and (2) a novel scoring model that takes in each new predicted event trigger, an argument, its argument role, as well as their paths in the AMR graph to estimate a compatibility score indicating the correctness of the pseudo label. The compatibility scores further act as feedback to encourage or discourage the model learning on the pseudo labels during self-training. Experimental results on three benchmark datasets, including ACE05-E, ACE05-E+, and ERE, demonstrate the effectiveness of the STF framework on event extraction, especially event argument extraction, with significant performance gain over the base event extraction models and strong baselines. Our experimental analysis further shows that STF is a generic framework as it can be applied to improve most, if not all, event extraction models by leveraging large-scale unlabeled data, even when high-quality AMR graph annotations are not available.

artificial intelligence, computational linguistic, machine learning, (14 more...)

2205.1249

Country:

Asia > Middle East > Iraq (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(32 more...)

Genre: Research Report (0.40)

Industry:

Education (0.67)
Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.67)

AIHubAug-1-2023, 09:00:00 GMT

On the stepwise nature of self-supervised learning

When training common SSL algorithms, we find that the loss descends in a stepwise fashion (top left) and the learned embeddings iteratively increase in dimensionality (bottom left). Direct visualization of embeddings (right; top three PCA directions shown) confirms that embeddings are initially collapsed to a point, which then expands to a 1D manifold, a 2D manifold, and beyond concurrently with steps in the loss. It is widely believed that deep learning's stunning success is due in part to its ability to discover and extract useful representations of complex data. Self-supervised learning (SSL) has emerged as a leading framework for learning these representations for images directly from unlabeled data, similar to how LLMs learn representations for language directly from web-scraped text. Yet despite SSL's key role in state-of-the-art models such as CLIP and MidJourney, fundamental questions like "what are self-supervised image systems really learning?" and "how does that learning actually occur?"

learning, representation, ssl method, (16 more...)

AIHub

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

arXiv.org Artificial IntelligenceAug-1-2023

Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning

Wang, Xindi, Wang, Yufei, Xu, Can, Geng, Xiubo, Zhang, Bowen, Tao, Chongyang, Rudzicz, Frank, Mercer, Robert E., Jiang, Daxin

Large language models (LLMs) have shown remarkable However, despite the advantages of ICL, it is still unclear how ICL capacity for in-context learning (ICL), where learning a new task learns knowledge from the given prompts without updating its model from just a few training examples is done without being explicitly parameters. Preliminary research [1, 11] compared ICL with simple pre-trained. However, despite the success of LLMs, there has been machine learning models, such as logistic regression and shallow little understanding of how ICL learns the knowledge from the given neural networks. In this paper, we take a further step and investigate prompts. In this paper, to make progress toward understanding the learning behaviour differences between ICL and supervised learning learning behaviour of ICL, we train the same LLMs with the same (SL). Specifically, we train three LLMs with the same training data demonstration examples via ICL and supervised learning (SL), respectively, via in-context learning and supervised learning separately and analyze and investigate their performance under label perturbations their generated outputs. While SL is a well-established approach (i.e., noisy labels and label imbalance) on a range of classification that uses labelled data to train models to make accurate predictions, tasks. First, via extensive experiments, we find that gold labels ICL takes a different approach by leveraging the context of the text have significant impacts on the downstream in-context performance, to learn from unlabeled data in order to improve the accuracy of the especially for large language models; however, imbalanced predictions. By comparing the performance of ICL and SL, we gain labels matter little to ICL across all model sizes.

large language model, machine learning, natural language, (20 more...)

2307.15411

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Holtz, Chester, Chen, Pengwen, Cloninger, Alexander, Cheng, Chung-Kuan, Mishne, Gal

Semi-Supervised Laplacian Learning on Stiefel Manifolds

arXiv.org Artificial IntelligenceJul-31-2023

Motivated by the need to address the degeneracy of canonical Laplace learning algorithms in low label rates, we propose to reformulate graph-based semi-supervised learning as a nonconvex generalization of a \emph{Trust-Region Subproblem} (TRS). This reformulation is motivated by the well-posedness of Laplacian eigenvectors in the limit of infinite unlabeled data. To solve this problem, we first show that a first-order condition implies the solution of a manifold alignment problem and that solutions to the classical \emph{Orthogonal Procrustes} problem can be used to efficiently find good classifiers that are amenable to further refinement. Next, we address the criticality of selecting supervised samples at low-label rates. We characterize informative samples with a novel measure of centrality derived from the principal eigenvectors of a certain submatrix of the graph Laplacian. We demonstrate that our framework achieves lower classification error compared to recent state-of-the-art and classical semi-supervised learning methods at extremely low, medium, and high label rates. Our code is available on github\footnote{anonymized for submission}.

artificial intelligence, inductive learning, machine learning, (18 more...)

2308.00142

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.75)

Deldari, Shohreh, Spathis, Dimitris, Malekzadeh, Mohammad, Kawsar, Fahim, Salim, Flora, Mathur, Akhil

Latent Masking for Multimodal Self-supervised Learning in Health Timeseries

arXiv.org Artificial IntelligenceJul-31-2023

Limited availability of labeled data for machine learning on biomedical time-series hampers progress in the field. Self-supervised learning (SSL) is a promising approach to learning data representations without labels. However, current SSL methods require expensive computations for negative pairs and are designed for single modalities, limiting their versatility. To overcome these limitations, we introduce CroSSL (Cross-modal SSL). CroSSL introduces two novel concepts: masking intermediate embeddings from modality-specific encoders and aggregating them into a global embedding using a cross-modal aggregator. This enables the handling of missing modalities and end-to-end learning of cross-modal patterns without prior data preprocessing or time-consuming negative-pair sampling. We evaluate CroSSL on various multimodal time-series benchmarks, including both medical-grade and consumer biosignals. Our results demonstrate superior performance compared to previous SSL techniques and supervised benchmarks with minimal labeled data. We additionally analyze the impact of different masking ratios and strategies and assess the robustness of the learned representations to missing modalities. Overall, our work achieves state-of-the-art performance while highlighting the benefits of masking latent embeddings for cross-modal learning in temporal health data.

artificial intelligence, crossl, machine learning, (16 more...)

2307.16847

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.86)
Research Report > Promising Solution (0.54)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)