AITopics | Inductive Learning

Collaborating Authors

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

News Overviews Instructional Materials AI-Alerts Classics

Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective

Liu, Alexander H., Yeh, Sung-Lin, Glass, James

arXiv.org Artificial IntelligenceJan-16-2024

Existing studies on self-supervised speech representation learning have focused on developing new training methods and applying pre-trained models for different applications. However, the quality of these models is often measured by the performance of different downstream tasks. How well the representations access the information of interest is less studied. In this work, we take a closer look into existing self-supervised methods of speech from an information-theoretic perspective. We aim to develop metrics using mutual information to help practical problems such as model design and selection. We use linear probes to estimate the mutual information between the target information and learned representations, showing another insight into the accessibility to the target information from speech representations. Further, we explore the potential of evaluating representations in a self-supervised fashion, where we estimate the mutual information between different parts of the data without using any labels. Finally, we show that both supervised and unsupervised measures echo the performance of the models on layer-wise linear probing and speech recognition.

information, mutual information, representation, (12 more...)

arXiv.org Artificial Intelligence

2401.08833

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.41)

Add feedback

Consistency of semi-supervised learning, stochastic tug-of-war games, and the p-Laplacian

Calder, Jeff, Drenska, Nadejda

arXiv.org Artificial IntelligenceJan-14-2024

In this paper we give a broad overview of the intersection of partial differential equations (PDEs) and graph-based semi-supervised learning. The overview is focused on a large body of recent work on PDE continuum limits of graph-based learning, which have been used to prove well-posedness of semi-supervised learning algorithms in the large data limit. We highlight some interesting research directions revolving around consistency of graph-based semi-supervised learning, and present some new results on the consistency of p-Laplacian semi-supervised learning using the stochastic tug-of-war game interpretation of the p-Laplacian. We also present the results of some numerical experiments that illustrate our results and suggest directions for future work.

graph, learning, semi-supervised learning, (16 more...)

arXiv.org Artificial Intelligence

2401.07463

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Minnesota (0.04)
(4 more...)

Genre: Research Report > New Finding (0.86)

Industry: Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules

Liu, Zhiyuan, Shi, Yaorui, Zhang, An, Zhang, Enzhi, Kawaguchi, Kenji, Wang, Xiang, Chua, Tat-Seng

arXiv.org Artificial IntelligenceJan-14-2024

Scrutinizing previous studies, we can reveal a common scheme consisting of three key components: (1) graph tokenizer, which breaks a molecular graph into smaller fragments (i.e., subgraphs) and converts them into tokens; (2) graph masking, which corrupts the graph with masks; (3) graph autoencoder, which first applies an encoder on the masked graph to generate the representations, and then employs a decoder on the representations to recover the tokens of the original graph. However, the previous MGM studies focus extensively on graph masking and encoder, while there is limited understanding of tokenizer and decoder. To bridge the gap, we first summarize popular molecule tokenizers at the granularity of node, edge, motif, and Graph Neural Networks (GNNs), and then examine their roles as the MGM's reconstruction targets. Further, we explore the potential of adopting an expressive decoder in MGM. Our results show that a subgraph-level tokenizer and a sufficiently expressive decoder with remask decoding have a large impact on the encoder's representation learning. Finally, we propose a novel MGM method SimSGT, featuring a Simple GNN-based Tokenizer (SGT) and an effective decoding strategy. We empirically validate that our method outperforms the existing molecule self-supervised learning methods. Our codes and checkpoints are available at https://github.com/syr-cn/SimSGT.

jziwgtooatszmryarw8q ud1ugxv, latexit sha1, tokenizer, (17 more...)

arXiv.org Artificial Intelligence

2310.14753

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)

Add feedback

Representation Learning for Weakly Supervised Relation Extraction

Li, Zhuang

arXiv.org Artificial IntelligenceJan-14-2024

Recent years have seen rapid development in Information Extraction, as well as its subtask, Relation Extraction. Relation Extraction is able to detect semantic relations between entities in sentences. Currently, many efficient approaches have been applied to relation extraction tasks. Supervised learning approaches especially have good performance. However, there are still many difficult challenges. One of the most serious problems is that manually labeled data is difficult to acquire. In most cases, limited data for supervised approaches equals lousy performance. Thus here, under the situation with only limited training data, we focus on how to improve the performance of our supervised baseline system with unsupervised pre-training. Feature is one of the key components in improving the supervised approaches. Traditional approaches usually apply hand-crafted features, which require expert knowledge and expensive human labor. However, this type of feature might suffer from data sparsity: when the training set size is small, the model parameters might be poorly estimated. In this thesis, we present several novel unsupervised pre-training models to learn the distributed text representation features, which are encoded with rich syntactic-semantic patterns of relation expressions. The experiments have demonstrated that this type of feature, combine with the traditional hand-crafted features, could improve the performance of the logistic classification model for relation extraction, especially on the classification of relations with only minor training instances.

f1measure 0, relation, representation, (12 more...)

arXiv.org Artificial Intelligence

2105.00815

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Minimally Supervised Learning using Topological Projections in Self-Organizing Maps

Lyu, Zimeng, Ororbia, Alexander, Li, Rui, Desell, Travis

arXiv.org Artificial IntelligenceJan-12-2024

Parameter prediction is essential for many applications, facilitating insightful interpretation and decision-making. However, in many real life domains, such as power systems, medicine, and engineering, it can be very expensive to acquire ground truth labels for certain datasets as they may require extensive and expensive laboratory testing. In this work, we introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs), which significantly reduces the required number of labeled data points to perform parameter prediction, effectively exploiting information contained in large unlabeled datasets. Our proposed method first trains SOMs on unlabeled data and then a minimal number of available labeled data points are ultimately assigned to key best matching units (BMU). The values estimated for newly-encountered data points are computed utilizing the average of the $n$ closest labeled data points in the SOM's U-matrix in tandem with a topological shortest path distance calculation scheme. Our results indicate that the proposed semi-supervised model significantly outperforms traditional regression techniques, including linear and polynomial regression, Gaussian process regression, K-nearest neighbors, as well as various deep neural network models.

neighbor, prediction, regression, (14 more...)

arXiv.org Artificial Intelligence

2401.06923

Genre: Research Report > New Finding (0.34)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Add feedback

Automated Machine Learning for Positive-Unlabelled Learning

Saunders, Jack D., Freitas, Alex A.

arXiv.org Artificial IntelligenceJan-12-2024

Positive-Unlabelled (PU) learning is a growing field of machine learning that aims to learn classifiers from data consisting of labelled positive and unlabelled instances, which can be in reality positive or negative, but whose label is unknown. An extensive number of methods have been proposed to address PU learning over the last two decades, so many so that selecting an optimal method for a given PU learning task presents a challenge. Our previous work has addressed this by proposing GA-Auto-PU, the first Automated Machine Learning (Auto-ML) system for PU learning. In this work, we propose two new Auto-ML systems for PU learning: BO-Auto-PU, based on a Bayesian Optimisation approach, and EBO-Auto-PU, based on a novel evolutionary/Bayesian optimisation approach. We also present an extensive evaluation of the three Auto-ML systems, comparing them to each other and to well-established PU learning methods across 60 datasets (20 real-world datasets, each with 3 versions in terms of PU learning characteristics).

auto-pu system, classifier, search space, (15 more...)

arXiv.org Artificial Intelligence

2401.06452

Country:

South America > Paraguay > Asunción > Asunción (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Wisconsin (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Overview (0.92)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(3 more...)

Add feedback

LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning

Zhang, Jifan, Chen, Yifang, Canal, Gregory, Mussmann, Stephen, Das, Arnav M., Bhatt, Gantavya, Zhu, Yinglun, Bilmes, Jeffrey, Du, Simon Shaolei, Jamieson, Kevin, Nowak, Robert D

arXiv.org Artificial IntelligenceJan-12-2024

Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive. To mitigate this cost, machine learning methods, such as transfer learning, semi-supervised learning and active learning, aim to be label-efficient: achieving high predictive performance from relatively few labeled examples. While obtaining the best label-efficiency in practice often requires combinations of these techniques, existing benchmark and evaluation frameworks do not capture a concerted combination of all such techniques. This paper addresses this deficiency by introducing LabelBench, a new computationally-efficient framework for joint evaluation of multiple label-efficient learning techniques. As an application of LabelBench, we introduce a novel benchmark of state-of-the-art active learning methods in combination with semi-supervised learning for fine-tuning pretrained vision transformers. Our benchmark demonstrates better label-efficiencies than previously reported in active learning. LabelBench's modular codebase is open-sourced for the broader community to contribute label-efficient learning methods and benchmarks. The repository can be found at: https://github.com/EfficientTraining/LabelBench.

accuracy, learning, pretrained model, (15 more...)

arXiv.org Artificial Intelligence

2306.0991

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > California > Riverside County > Riverside (0.14)
(5 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model

Chen, Yuqi, Ren, Kan, Song, Kaitao, Wang, Yansen, Wang, Yifan, Li, Dongsheng, Qiu, Lili

arXiv.org Artificial IntelligenceJan-11-2024

Self-supervised learning has emerged as a highly effective approach in the fields of natural language processing and computer vision. It is also applicable to brain signals such as electroencephalography (EEG) data, given the abundance of available unlabeled data that exist in a wide spectrum of real-world medical applications ranging from seizure detection to wave analysis. The existing works leveraging self-supervised learning on EEG modeling mainly focus on pretraining upon each individual dataset corresponding to a single downstream task, which cannot leverage the power of abundant data, and they may derive sub-optimal solutions with a lack of generalization. Moreover, these methods rely on end-to-end model learning which is not easy for humans to understand. In this paper, we present a novel EEG foundation model, namely EEGFormer, pretrained on large-scale compound EEG data. The pretrained model cannot only learn universal representations on EEG signals with adaptable performance on various downstream tasks but also provide interpretable outcomes of the useful patterns within the data. To validate the effectiveness of our model, we extensively evaluate it on various downstream tasks and assess the performance under different transfer settings. Furthermore, we demonstrate how the learned model exhibits transferable anomaly detection performance and provides valuable interpretability of the acquired patterns via self-supervised learning.

arxiv preprint arxiv, eeg data, representation, (13 more...)

arXiv.org Artificial Intelligence

2401.10278

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.76)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback

Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning

Li, Xiang, Belagali, Varun, Shang, Jinghuan, Ryoo, Michael S.

arXiv.org Artificial IntelligenceJan-11-2024

Sequence modeling approaches have shown promising results in robot imitation learning. Recently, diffusion models have been adopted for behavioral cloning in a sequence modeling fashion, benefiting from their exceptional capabilities in modeling complex data distributions. The standard diffusion-based policy iteratively generates action sequences from random noise conditioned on the input states. Nonetheless, the model for diffusion policy can be further improved in terms of visual representations. In this work, we propose Crossway Diffusion, a simple yet effective method to enhance diffusion-based visuomotor policy learning via a carefully designed state decoder and an auxiliary self-supervised learning (SSL) objective. The state decoder reconstructs raw image pixels and other state information from the intermediate representations of the reverse diffusion process. The whole model is jointly optimized by the SSL objective and the original diffusion loss. Our experiments demonstrate the effectiveness of Crossway Diffusion in various simulated and real-world robot tasks, confirming its consistent advantages over the standard diffusion-based policy and substantial improvements over the baselines.

crossway diffusion, proceedings, state decoder, (12 more...)

arXiv.org Artificial Intelligence

2307.01849

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.61)

Add feedback

FourCastNeXt: Improving FourCastNet Training with Limited Compute

Guo, Edison, Ahmed, Maruf, Sun, Yue, Mahendru, Rahul, Yang, Rui, Cook, Harrison, Leeuwenburg, Tennessee, Evans, Ben

arXiv.org Artificial IntelligenceJan-10-2024

Recently, the FourCastNet Neural Earth System Model (NESM) by (Pathak et al., 2022) has shown impressive results on predicting various atmospheric variables, trained on the ERA5 reanalysis dataset. While FourCastNet enjoys quasi-linear time and memory complexity in sequence length compared to quadratic complexity in vanilla transformers, training FourCastNet on ERA5 from scratch still requires large amount of compute resources, which is expensive or even inaccessible to most researchers. In this work, we will show improved methods that can train FourCastNet using only 1% of the compute required by the baseline, while maintaining model performance or par or even better than the baseline. In this report, we will provide technical details of our methodologies along with experimental results and ablation study of different components of our methods. We have called our improved model FourCastNeXt, in a similar spirit to ConvNeXt (Liu et al., 2022).

baseline, fourcastnet, time step, (13 more...)

arXiv.org Artificial Intelligence

2401.05584

Country:

North America > United States > Tennessee (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.32)

Add feedback