AITopics | Inductive Learning

Collaborating Authors

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

News Overviews Instructional Materials AI-Alerts Classics

State of Self-Supervised Learning in 2022 part1

#artificialintelligenceOct-3-2022, 17:00:14 GMT

Abstract: Self-supervised learning (SSL) has achieved remarkable performance on various medical imaging tasks by dint of priors from massive unlabeled data. However, for a specific downstream task, there is still a lack of an instruction book on how to select suitable pretext tasks and implementation details. Then, we conduct extensive experiments to explore four significant issues in SSL for medical imaging, including (1) the effect of self-supervised pretraining on imbalanced datasets, (2) network architectures, (3) the applicability of upstream tasks to downstream tasks and (4) the stacking effect of SSL and commonly used policies for deep learning, including data resampling and augmentation. Based on the experimental results, potential guidelines are presented for self-supervised pretraining in medical imaging. Abstract: Lyrics recognition is an important task in music processing.

dataset, self-supervised learning, ssl model, (6 more...)

#artificialintelligence

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.39)

Add feedback

That Sounds Right: Auditory Self-Supervision for Dynamic Robot Manipulation

Thankaraj, Abitha, Pinto, Lerrel

arXiv.org Artificial IntelligenceOct-3-2022

Learning to produce contact-rich, dynamic behaviors from raw sensory data has been a longstanding challenge in robotics. Prominent approaches primarily focus on using visual or tactile sensing, where unfortunately one fails to capture high-frequency interaction, while the other can be too delicate for large-scale data collection. In this work, we propose a data-centric approach to dynamic manipulation that uses an often ignored source of information: sound. We first collect a dataset of 25k interaction-sound pairs across five dynamic tasks using commodity contact microphones. Then, given this data, we leverage self-supervised learning to accelerate behavior prediction from sound. Our experiments indicate that this self-supervised 'pretraining' is crucial to achieving high performance, with a 34.5% lower MSE than plain supervised learning and a 54.3% lower MSE over visual training. Importantly, we find that when asked to generate desired sound profiles, online rollouts of our models on a UR10 robot can produce dynamic behavior that achieves an average of 11.5% improvement over supervised learning on audio similarity metrics.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2210.01116

Country:

North America > United States > New York (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.91)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Random Text Perturbations Work, but not Always

Wang, Zhengxiang

arXiv.org Artificial IntelligenceOct-2-2022

We present three large-scale experiments on binary text matching classification task both in Chinese and English to evaluate the effectiveness and generalizability of random text perturbations as a data augmentation approach for NLP. It is found that the augmentation can bring both negative and positive effects to the test set performance of three neural classification models, depending on whether the models train on enough original training examples. This remains true no matter whether five random text editing operations, used to augment text, are applied together or separately. Our study demonstrates with strong implication that the effectiveness of random text perturbations is task specific and not generally positive.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2209.00797

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Computer Vision - Richard Szeliski

#artificialintelligenceOct-1-2022, 14:21:57 GMT

As humans, we perceive the three-dimensional structure of the world around us with apparent ease. Think of how vivid the three-dimensional percept is when you look at a vase of flowers sitting on the table next to you. You can tell the shape and translucency of each petal through the subtle patterns of light and shading that play across its surface and effortlessly segment each flower from the background of the scene (Figure 1.1). Looking at a framed group por- trait, you can easily count (and name) all of the people in the picture and even guess at their emotions from their facial appearance. Perceptual psychologists have spent decades trying to understand how the visual system works and, even though they can devise optical illusions1 to tease apart some of its principles (Figure 1.3), a complete solution to this puzzle remains elusive (Marr 1982; Palmer 1999; Livingstone 2008).

canada government, diagnostic medicine, pattern recognition, (51 more...)

#artificialintelligence

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.45)
North America > United States > New Jersey (0.45)
Europe > Spain (0.45)
(38 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > New Finding (1.00)
(4 more...)

Industry:

Transportation > Ground > Road (1.00)
Semiconductors & Electronics (1.00)
Media > Television (1.00)
(14 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
(36 more...)

Add feedback

Cut-Paste Consistency Learning for Semi-Supervised Lesion Segmentation

Yap, Boon Peng, Ng, Beng Koon

arXiv.org Artificial IntelligenceOct-1-2022

Semi-supervised learning has the potential to improve the data-efficiency of training data-hungry deep neural networks, which is especially important for medical image analysis tasks where labeled data is scarce. In this work, we present a simple semi-supervised learning method for lesion segmentation tasks based on the ideas of cut-paste augmentation and consistency regularization. By exploiting the mask information available in the labeled data, we synthesize partially labeled samples from the unlabeled images so that the usual supervised learning objective (e.g., binary cross entropy) can be applied. Additionally, we introduce a background consistency term to regularize the training on the unlabeled background regions of the synthetic images. We empirically verify the effectiveness of the proposed method on two public lesion segmentation datasets, including an eye fundus photograph dataset and a brain CT scan dataset. The experiment results indicate that our method achieves consistent and superior performance over other self-training and consistency-based methods without introducing sophisticated network components.

artificial intelligence, machine learning, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2210.00191

Country: Asia > Singapore (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

A Neural Model for Regular Grammar Induction

Belcák, Peter, Hofer, David, Wattenhofer, Roger

arXiv.org Artificial IntelligenceOct-1-2022

Grammatical inference is a classical problem in computational learning theory and a topic of wider influence in natural language processing. We treat grammars as a model of computation and propose a novel neural approach to induction of regular grammars from positive and negative examples. Our model is fully explainable, its intermediate results are directly interpretable as partial parses, and it can be used to learn arbitrary regular grammars when provided with sufficient data. We find that our method consistently attains high recall and precision scores across a range of tests of varying complexity.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2209.11628

Country: Europe > Switzerland > Zürich > Zürich (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Add feedback

Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

Lavoie, Samuel, Tsirigotis, Christos, Schwarzer, Max, Vani, Ankit, Noukhovitch, Michael, Kawaguchi, Kenji, Courville, Aaron

arXiv.org Artificial IntelligenceSep-30-2022

Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation. This procedure conditions the representation onto a constrained space during pretraining and imparts an inductive bias for group sparsity. For downstream classification, we formally prove that the SEM representation leads to better generalization than an unnormalized representation. Furthermore, we empirically demonstrate that SSL methods trained with SEMs have improved generalization on natural image datasets such as CIFAR-100 and ImageNet. Finally, when used in a downstream classification task, we show that SEM features exhibit emergent semantic coherence where small groups of learned features are distinctly predictive of semantically-relevant classes.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Artificial Intelligence

2204.00616

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
(8 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Few-shot Text Classification with Dual Contrastive Consistency

Sun, Liwen, Han, Jiawei

arXiv.org Artificial IntelligenceSep-29-2022

In this paper, we explore how to utilize pre-trained language model to perform few-shot text classification where only a few annotated examples are given for each class. Since using traditional cross-entropy loss to fine-tune language model under this scenario causes serious overfitting and leads to sub-optimal generalization of model, we adopt supervised contrastive learning on few labeled data and consistency-regularization on vast unlabeled data. Moreover, we propose a novel contrastive consistency to further boost model performance and refine sentence representation. After conducting extensive experiments on four datasets, we demonstrate that our model (FTCC) can outperform state-of-the-art methods and has better robustness.

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2209.15069

Country:

North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)

Add feedback

Learning by Distilling Context

Snell, Charlie, Klein, Dan, Zhong, Ruiqi

arXiv.org Artificial IntelligenceSep-29-2022

Language models significantly benefit from context tokens, such as prompts or scratchpads. They perform better when prompted with informative instructions, and they acquire new reasoning capabilities by generating a scratch-pad before predicting the final answers. However, they do not \textit{internalize} these performance gains, which disappear when the context tokens are gone. Our work proposes to apply context distillation so that a language model can improve itself by internalizing these gains. Concretely, given a synthetic unlabeled input for the target task, we condition the model on ``[instructions] + [task-input]'' to predict ``[scratch-pad] + [final answer]''; then we fine-tune the same model to predict its own ``[final answer]'' conditioned on the ``[task-input]'', without seeing the ``[instructions]'' or using the ``[scratch-pad]''. We show that context distillation is a general method to train language models, and it can effectively internalize 3 types of training signals. First, it can internalize abstract task instructions and explanations, so we can iteratively update the model parameters with new instructions and overwrite old ones. Second, it can internalize step-by-step reasoning for complex tasks (e.g., 8-digit addition), and such a newly acquired capability proves to be useful for other downstream tasks. Finally, it can internalize concrete training examples, and it outperforms directly learning with gradient descent by 9\% on the SPIDER Text-to-SQL dataset; furthermore, combining context distillation operations can internalize more training examples than the context window size allows.

distillation, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2209.15189

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(3 more...)

Genre: Research Report (0.65)

Industry:

Media > Film (0.47)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Depression Symptoms Modelling from Social Media Text: A Semi-supervised Learning Approach

Farruque, Nawshad, Goebel, Randy, Sivapalan, Sudhakar, Zaiane, Osmar

arXiv.org Artificial IntelligenceSep-28-2022

A fundamental component of user-level social media language based clinical depression modelling is depression symptoms detection (DSD). Unfortunately, there does not exist any DSD dataset that reflects both the clinical insights and the distribution of depression symptoms from the samples of self-disclosed depressed population. In our work, we describe a Semi-supervised Learning (SSL) framework which uses an initial supervised learning model that leverages 1) a state-of-the-art large mental health forum text pre-trained language model further fine-tuned on a clinician annotated DSD dataset, 2) a Zero-Shot learning model for DSD, and couples them together to harvest depression symptoms related samples from our large self-curated Depression Tweets Repository (DTR). Our clinician annotated dataset is the largest of its kind. Furthermore, DTR is created from the samples of tweets in self-disclosed depressed users Twitter timeline from two datasets, including one of the largest benchmark datasets for user-level depression detection from Twitter. This further helps preserve the depression symptoms distribution of self-disclosed Twitter users tweets. Subsequently, we iteratively retrain our initial DSD model with the harvested data. We discuss the stopping criteria and limitations of this SSL process, and elaborate the underlying constructs which play a vital role in the overall SSL process. We show that we can produce a final dataset which is the largest of its kind. Furthermore, a DSD and a Depression Post Detection (DPD) model trained on it achieves significantly better accuracy than their initial version.

artificial intelligence, machine learning, social media, (15 more...)

arXiv.org Artificial Intelligence

2209.02765

Country: North America > Canada > Alberta (0.14)

Genre:

Workflow (0.96)
Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.61)

Add feedback