AITopics | Inductive Learning


Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.


Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL

Yang, Rui, Lu, Yiming, Li, Wenzhe, Sun, Hao, Fang, Meng, Du, Yali, Li, Xiu, Han, Lei, Zhang, Chongjie

arXiv.org Artificial Intelligence, Feb-13-2022

Solving goal-conditioned tasks with sparse rewards using self-supervised learning is promising because of its simplicity and stability over current reinforcement learning (RL) algorithms. A recent work, called Goal-Conditioned Supervised Learning (GCSL), provides a new learning framework by iteratively relabeling and imitating self-generated experiences. In this paper, we revisit the theoretical property of GCSL, namely that it optimizes a lower bound of the goal-reaching objective, and extend GCSL as a novel offline goal-conditioned RL algorithm. The proposed method is named Weighted GCSL (WGCSL), in which we introduce an advanced compound weight consisting of three parts: (1) discounted weight for goal relabeling, (2) goal-conditioned exponential advantage weight, and (3) best-advantage weight. Theoretically, WGCSL is proved to optimize an equivalent lower bound of the goal-conditioned RL objective and to generate monotonically improved policies via an iterated scheme. The monotonic property holds for any behavior policy, and therefore WGCSL can be applied to both online and offline settings. To evaluate algorithms in the offline goal-conditioned RL setting, we provide a benchmark including a range of point and simulated robot domains. Experiments in the introduced benchmark demonstrate that WGCSL can consistently outperform GCSL and existing state-of-the-art offline methods in the fully offline goal-conditioned setting.

offline rl, rethinking goal-conditioned supervised learning

arXiv.org Artificial Intelligence

2202.04478

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)


Online Decision Transformer

Zheng, Qinqing, Zhang, Amy, Grover, Aditya

arXiv.org Artificial Intelligence, Feb-11-2022

Generative pretraining for sequence modeling has emerged as a unifying paradigm for machine learning in a number of domains and modalities, notably in language and vision (Radford et al., 2018; Chen et al., 2020; Brown et al., 2020; Lu et al., 2022). Recently, such a pretraining paradigm has been extended to offline reinforcement learning (RL) (Chen et al., 2021; Janner et al., 2021), wherein an agent is trained to autoregressively maximize the likelihood of trajectories in the offline dataset. During training, this paradigm essentially converts offline RL to a supervised learning problem (Schmidhuber, 2019; Srivastava et al., 2019; Emmons et al., 2021). However, these works present an incomplete picture, as policies learned via offline RL are limited by the quality of the training dataset and need to be finetuned to the task of interest via online interactions. It remains an open question whether such a supervised learning paradigm can be extended to online settings. Unlike language and perception, online finetuning for RL is fundamentally different from the pretraining phase, as it involves data acquisition via exploration. The need for exploration renders traditional supervised learning objectives (e.g., mean squared error) for offline RL insufficient in the online setting. Moreover, it has been observed that for standard online algorithms, access to offline data can often have zero or even negative effect on the online performance (Nair et al., 2020). Hence, the overall pipeline of offline pretraining followed by online finetuning for RL policies needs careful consideration of training objectives and protocols.

odt, online, trajectory, (13 more...)

arXiv.org Artificial Intelligence

2202.05607

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Indiana (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)


What is K-Nearest Neighbor (KNN)?

#artificialintelligence, Feb-10-2022, 20:07:58 GMT

The K-Nearest Neighbor (KNN) algorithm is a popular supervised learning model that can be used to solve both classification and regression problems. In this article, I will give a detailed explanation of how this model works. K-Nearest Neighbor is one of the simplest machine learning algorithms based on the supervised learning technique. The KNN algorithm assumes similarity between the new case and the available cases and puts the new case into the category it is most similar to. The value of K is very important.
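The classification step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the linked article: the function name `knn_predict`, the Euclidean distance metric, the majority-vote rule, and the toy data are all assumptions chosen for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; assumes numeric feature tuples
    and Euclidean distance.
    """
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Distance from the query to every training point, paired with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most frequent label among those neighbors is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up for this sketch): two well-separated clusters.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(train_points, train_labels, (1.1, 1.0), k=3)
```

The choice of K matters because it sets how many neighbors vote: a small K follows individual points closely and is sensitive to noise, while a large K smooths the decision toward the majority class. For regression, the same neighbor search could be kept and the vote replaced by an average of the neighbors' target values.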

algorithm, k-nearest neighbor, knn, (3 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.81)


Structured Prediction Problem Archive

#artificialintelligence, Feb-10-2022, 18:20:23 GMT

Structured prediction problems are one of the fundamental tools in machine learning. In order to facilitate algorithm development for their numerical solution, we collect in one place a large number of datasets in easy to read formats for a diverse set of problem classes. We provide archival links to datasets, description of the considered problems and problem formats, and a short summary of problem characteristics including size, number of instances etc. For reference we also give a non-exhaustive selection of algorithms proposed in the literature for their solution. We hope that this central repository will make benchmarking and comparison to established works easier.

algorithm, dataset, structured prediction problem archive

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)


Predictive Inference with Weak Supervision

Cauchois, Maxime, Gupta, Suyash, Ali, Alnur, Duchi, John

arXiv.org Machine Learning, Feb-9-2022

Consider the typical supervised learning pipeline that we teach students learning statistical machine learning: we collect data in (X, Y) pairs, where Y is a label or target to be predicted; we pick a model and loss measuring the fidelity of the model to observed data; we choose the model minimizing the loss and validate it on held-out data. This picture obscures what is becoming one of the major challenges in this endeavor: that of actually collecting high-quality labeled data [44, 13, 38]. Hand labeling large-scale training sets is often impractically expensive. Consider, as simple motivation, a ranking problem: a prediction is an ordered list of a set of items, yet available feedback is likely to be incomplete and partial, such as a top element (for example, in web search a user clicks on a single preferred link, or in a grocery, an individual buys one kind of milk but provides no feedback on the other brands present). Developing methods to leverage such partial and weak feedback is therefore becoming a major focus, and researchers have developed methods to transform weak and noisy labels into a dataset with strong, "gold-standard" labels [38, 56]. In this paper, we adopt this weakly labeled setting, but instead of considering model fitting and the construction of strong labels, we focus on validation, model confidence, and predictive inference, moving beyond point predictions and single labels. Our goal is to develop methods to rigorously quantify the confidence a practitioner should have in a model given only weak labels.

alg, configuration, prediction, (16 more...)

arXiv.org Machine Learning

2201.08315

Country:

North America > United States (0.28)
Asia > Middle East > Jordan (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Government > Voting & Elections (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)


Zhang

AAAI Conferences, Feb-8-2022, 12:41:06 GMT

In multi-label learning, each object is represented by a single instance while associated with a set of class labels. Due to the huge (exponential) number of possible label sets for prediction, existing approaches mainly focus on how to exploit label correlations to facilitate the learning process. Nevertheless, an intrinsic characteristic of learning from multi-label data, i.e. the widely-existing class-imbalance among labels, has not been well investigated. Generally, the number of positive training instances w.r.t. each class label is far less than its negative counterparts, which may lead to performance degradation for most multi-label learning techniques. In this paper, a new multi-label learning approach named Cross-Coupling Aggregation (COCOA) is proposed, which aims at leveraging the exploitation of label correlations as well as the exploration of class-imbalance. Briefly, to induce the predictive model on each class label, one binary-class imbalance learner corresponding to the current label and several multi-class imbalance learners coupling with other labels are aggregated for prediction. Extensive experiments clearly validate the effectiveness of the proposed approach, especially in terms of imbalance-specific evaluation metrics such as F-measure and area under the ROC curve.

class label, prediction, zhang

AAAI Conferences

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)


MacGlashan

AAAI Conferences, Feb-8-2022, 12:39:08 GMT

Research in learning from demonstration can generally be grouped into either imitation learning or intention learning. In imitation learning, the goal is to imitate the observed behavior of an expert and is typically achieved using supervised learning techniques. In intention learning, the goal is to learn the intention that motivated the expert's behavior and to use a planning algorithm to derive behavior. Imitation learning has the advantage of learning a direct mapping from states to actions, which bears a small computational cost. Intention learning has the advantage of behaving well in novel states, but may bear a large computational cost by relying on planning algorithms in complex tasks. In this work, we introduce receding horizon inverse reinforcement learning, in which the planning horizon induces a continuum between these two learning paradigms. We present empirical results on multiple domains that demonstrate that performing IRL with a small, but non-zero, receding planning horizon greatly decreases the computational cost of planning while maintaining superior generalization performance compared to imitation learning.

computational cost, imitation learning, learning, (3 more...)

AAAI Conferences

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.65)


Aineto

AAAI Conferences, Feb-8-2022, 11:28:14 GMT

This paper presents a novel approach for learning STRIPS action models from examples that compiles this inductive learning task into a classical planning task. Interestingly, the compilation approach is flexible to different amounts of available input knowledge; the learning examples can range from a set of plans (with their corresponding initial and final states) to just a pair of initial and final states (no intermediate action or state is given). Moreover, the compilation accepts partially specified action models, and it can be used to validate whether the observation of a plan execution follows a given STRIPS action model, even if this model is not fully specified.

action model, aineto, strips action model, (1 more...)

AAAI Conferences

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.33)


Assenmacher

AAAI Conferences, Feb-8-2022, 11:18:50 GMT

The detection of orchestrated and potentially manipulative campaigns in social media is far more meaningful than analyzing single account behaviour but also more challenging in terms of pattern recognition, data processing, and computational complexity. While supervised learning methods need an enormous amount of reliable ground truth data to find rather inflexible patterns, classical unsupervised learning techniques need a lot of computational power to handle large amounts of data. This makes them infeasible for real-time analysis. In this work, we demonstrate the applicability of text stream clustering for the real-time detection of coordinated campaigns.

assenmacher, detection

AAAI Conferences

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.74)


Xu

AAAI Conferences, Feb-8-2022, 11:18:31 GMT

To build a French national electronic injury surveillance system based on emergency room visits, we aim to develop a coding system to classify the causes of those visits from free-text clinical notes. Supervised learning techniques have shown good results in this area but require a large expert-annotated dataset, which is time-consuming and costly to obtain. We hypothesize that a Natural Language Processing Transformer model incorporating a generative self-supervised pre-training step can significantly reduce the required number of annotated samples for supervised fine-tuning. In this preliminary study, we test our hypothesis on the simplified problem of predicting whether or not a visit is the consequence of a traumatic event from free-text clinical notes. Using fully re-trained GPT-2 models (without OpenAI pre-trained weights), we assess the gain of applying a self-supervised pre-training phase with unlabeled notes prior to the supervised learning task.

clinical note

AAAI Conferences

Industry: Health & Medicine > Health Care Providers & Services (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
