AITopics

2305.10397

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning

Wang, Yu, Qiao, Pengchong, Liu, Chang, Song, Guoli, Zheng, Xiawu, Chen, Jie

Recent advances in robust semi-supervised learning (SSL) typically filter out-of-distribution (OOD) information at the sample level. We argue that an overlooked problem of robust SSL is its corrupted information on semantic level, practically limiting the development of the field. In this paper, we take an initial step to explore and propose a unified framework termed OOD Semantic Pruning (OSP), which aims at pruning OOD semantics out from in-distribution (ID) features. Specifically, (i) we propose an aliasing OOD matching module to pair each ID sample with an OOD sample with semantic overlap. (ii) We design a soft orthogonality regularization, which first transforms each ID feature by suppressing its semantic component that is collinear with paired OOD sample. It then forces the predictions before and after soft orthogonality decomposition to be consistent. Being practically simple, our method shows a strong performance in OOD detection and ID classification on challenging benchmarks. In particular, OSP surpasses the previous state-of-the-art by 13.7% on accuracy for ID classification and 5.9% on AUROC for OOD detection on TinyImageNet dataset. The source codes are publicly available at https://github.com/rain305f/OSP.

artificial intelligence, inductive learning, machine learning, (19 more...)

2305.18158

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Garg, Krishna, Chowdhury, Jishnu Ray, Caragea, Cornelia

Data Augmentation for Low-Resource Keyphrase Generation

Keyphrase generation is the task of summarizing the contents of any given article into a few salient phrases (or keyphrases). Existing works for the task mostly rely on large-scale annotated datasets, which are not easy to acquire. Very few works address the problem of keyphrase generation in low-resource settings, but they still rely on a lot of additional unlabeled data for pretraining and on automatic methods for pseudo-annotations. In this paper, we present data augmentation strategies specifically to address keyphrase generation in purely resource-constrained domains. We design techniques that use the full text of the articles to improve both present and absent keyphrase generation. We test our approach comprehensively on three datasets and show that the data augmentation strategies consistently improve the state-of-the-art performance. We release our source code at https://github.com/kgarg8/kpgen-lowres-data-aug.

computational linguistic, machine learning, natural language, (17 more...)

2305.17968

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(13 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.66)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Khan, Abid, Lee, Chia-Hao, Huang, Pinshane Y., Clark, Bryan K.

Leveraging generative adversarial networks to create realistic scanning transmission electron microscopy images

The rise of automation and machine learning (ML) in electron microscopy has the potential to revolutionize materials research through autonomous data collection and processing. A significant challenge lies in developing ML models that rapidly generalize to large data sets under varying experimental conditions. We address this by employing a cycle generative adversarial network (CycleGAN) with a reciprocal space discriminator, which augments simulated data with realistic spatial frequency information. This allows the CycleGAN to generate images nearly indistinguishable from real data and provide labels for ML applications. We showcase our approach by training a fully convolutional network (FCN) to identify single atom defects in a 4.5 million atom data set, collected using automated acquisition in an aberration-corrected scanning transmission electron microscope (STEM). Our method produces adaptable FCNs that can adjust to dynamically changing experimental variables with minimal intervention, marking a crucial step towards fully autonomous harnessing of microscopy big data.

artificial intelligence, cyclegan, machine learning, (18 more...)

doi: 10.1038/s41524-023-01042-3

2301.07743

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Africa > Central African Republic > Nana-Mambere > Bouar (0.04)

Genre: Research Report (0.82)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Becker, Evan, Pandit, Parthe, Rangan, Sundeep, Fletcher, Alyson K.

Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks

Generative Adversarial Networks (GANs) are a popular formulation to train generative models for complex high dimensional data. The standard method for training GANs involves a gradient descent-ascent (GDA) procedure on a minimax optimization problem. This procedure is hard to analyze in general due to the nonlinear nature of the dynamics. We study the local dynamics of GDA for training a GAN with a kernel-based discriminator. This convergence analysis is based on a linearization of a non-linear dynamical system that describes the GDA iterations, under an \textit{isolated points model} assumption from [Becker et al. 2022]. Our analysis brings out the effect of the learning rates, regularization, and the bandwidth of the kernel discriminator, on the local convergence rate of GDA. Importantly, we show phase transitions that indicate when the system converges, oscillates, or diverges. We also provide numerical simulations that verify our claims.

artificial intelligence, equation, machine learning, (17 more...)

2305.08277

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.61)

Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations

Oh, Yoori, Lee, Juheon, Han, Yoseob, Lee, Kyogu

Recent text-to-speech models have reached the level of generating natural speech similar to what humans say. But there still have limitations in terms of expressiveness. The existing emotional speech synthesis models have shown controllability using interpolated features with scaling parameters in emotional latent space. However, the emotional latent space generated from the existing models is difficult to control the continuous emotional intensity because of the entanglement of features like emotions, speakers, etc. In this paper, we propose a novel method to control the continuous intensity of emotions using semi-supervised learning. The model learns emotions of intermediate intensity using pseudo-labels generated from phoneme-level sequences of speech information. An embedding space built from the proposed model satisfies the uniform grid geometry with an emotional basis. The experimental results showed that the proposed method was superior in controllability and naturalness.

artificial intelligence, emotion, machine learning, (15 more...)

2211.0616

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.85)

Bhusal, Gokul, Merkurjev, Ekaterina, Wei, Guo-Wei

Persistent Laplacian-enhanced Algorithm for Scarcely Labeled Data Classification

arXiv.org Artificial IntelligenceMay-25-2023

The success of many machine learning (ML) methods depends crucially on having large amounts of labeled data. However, obtaining enough labeled data can be expensive, time-consuming, and subject to ethical constraints for many applications. One approach that has shown tremendous value in addressing this challenge is semi-supervised learning (SSL); this technique utilizes both labeled and unlabeled data during training, often with much less labeled data than unlabeled data, which is often relatively easy and inexpensive to obtain. In fact, SSL methods are particularly useful in applications where the cost of labeling data is especially expensive, such as medical analysis, natural language processing (NLP), or speech recognition. A subset of SSL methods that have achieved great success in various domains involves algorithms that integrate graph-based techniques. These procedures are popular due to the vast amount of information provided by the graphical framework and the versatility of their applications. In this work, we propose an algebraic topology-based semi-supervised method called persistent Laplacian-enhanced graph MBO (PL-MBO) by integrating persistent spectral graph theory with the classical Merriman-Bence- Osher (MBO) scheme. Specifically, we use a filtration procedure to generate a sequence of chain complexes and associated families of simplicial complexes, from which we construct a family of persistent Laplacians. Overall, it is a very efficient procedure that requires much less labeled data to perform well compared to many ML techniques, and it can be adapted for both small and large datasets. We evaluate the performance of the proposed method on data classification, and the results indicate that the proposed technique outperforms other existing semi-supervised algorithms.

artificial intelligence, laplacian, machine learning, (15 more...)

2305.16239

Country:

North America > United States > Michigan (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Texas (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Zheng, Yandan, Hao, Anran, Luu, Anh Tuan

Jointprop: Joint Semi-supervised Learning for Entity and Relation Extraction with Heterogeneous Graph-based Propagation

arXiv.org Artificial IntelligenceMay-25-2023

Semi-supervised learning has been an important approach to address challenges in extracting entities and relations from limited data. However, current semi-supervised works handle the two tasks (i.e., Named Entity Recognition and Relation Extraction) separately and ignore the cross-correlation of entity and relation instances as well as the existence of similar instances across unlabeled data. To alleviate the issues, we propose Jointprop, a Heterogeneous Graph-based Propagation framework for joint semi-supervised entity and relation extraction, which captures the global structure information between individual tasks and exploits interactions within unlabeled data. Specifically, we construct a unified span-based heterogeneous graph from entity and relation candidates and propagate class labels based on confidence scores. We then employ a propagation learning scheme to leverage the affinities between labelled and unlabeled samples. Experiments on benchmark datasets show that our framework outperforms the state-of-the-art semi-supervised approaches on NER and RE tasks. We show that the joint semi-supervised learning of the two tasks benefits from their codependency and validates the importance of utilizing the shared information between unlabeled data.

computational linguistic, machine learning, natural language, (16 more...)

2305.15872

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Portugal > Lisbon > Lisbon (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)

Huang, Zhe, Sidhom, Mary-Joy, Wessler, Benjamin S., Hughes, Michael C.

Fix-A-Step: Semi-supervised Learning from Uncurated Unlabeled Data

arXiv.org Artificial IntelligenceMay-25-2023

Semi-supervised learning (SSL) promises improved accuracy compared to training classifiers on small labeled datasets by also training on many unlabeled images. In real applications like medical imaging, unlabeled data will be collected for expediency and thus uncurated: possibly different from the labeled set in classes or features. Unfortunately, modern deep SSL often makes accuracy worse when given uncurated unlabeled data. Recent complex remedies try to detect out-of-distribution unlabeled images and then discard or downweight them. Instead, we introduce Fix-A-Step, a simpler procedure that views all uncurated unlabeled images as potentially helpful. Our first insight is that even uncurated images can yield useful augmentations of labeled data. Second, we modify gradient descent updates to prevent optimizing a multi-task SSL loss from hurting labeled-set accuracy. Fix-A-Step can repair many common deep SSL methods, improving accuracy on CIFAR benchmarks across all tested methods and levels of artificial class mismatch. On a new medical SSL benchmark called Heart2Heart, Fix-A-Step can learn from 353,500 truly uncurated ultrasound images to deliver gains that generalize across hospitals.

artificial intelligence, fix-a-step, machine learning, (15 more...)

2208.1187

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > France (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Li, Yuxiao, Mazuelas, Santiago, Shen, Yuan

A Semi-Supervised Learning Approach for Ranging Error Mitigation Based on UWB Waveform

arXiv.org Artificial IntelligenceMay-23-2023

Localization systems based on ultra-wide band (UWB) measurements can have unsatisfactory performance in harsh environments due to the presence of non-line-of-sight (NLOS) errors. Learning-based methods for error mitigation have shown great performance improvement via directly exploiting the wideband waveform instead of handcrafted features. However, these methods require data samples fully labeled with actual measurement errors for training, which leads to time-consuming data collection. In this paper, we propose a semi-supervised learning method based on variational Bayes for UWB ranging error mitigation. Combining deep learning techniques and statistic tools, our method can efficiently accumulate knowledge from both labeled and unlabeled data samples. Extensive experiments illustrate the effectiveness of the proposed method under different supervision rates, and the superiority compared to other fully supervised methods even at a low supervision rate.

artificial intelligence, machine learning, supervision rate, (18 more...)

doi: 10.1109/MILCOM52596.2021.9653043.

2305.18208

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)