AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

Shi, Xingjian, Mueller, Jonas, Erickson, Nick, Li, Mu, Smola, Alexander J.

arXiv.org Machine Learning, Nov-4-2021

We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well. Here we assemble 18 multimodal data tables that each contain some text fields and stem from a real business application. Our publicly-available benchmark enables researchers to comprehensively evaluate their own methods for supervised learning with numeric, categorical, and text features. To ensure that any single modeling strategy which performs well over all 18 datasets will serve as a practical foundation for multimodal text/tabular AutoML, the diverse datasets in our benchmark vary greatly in: sample size, problem types (a mix of classification and regression tasks), number of features (with the number of text columns ranging from 1 to 28 between datasets), as well as how the predictive signal is decomposed between text vs. numeric/categorical features (and predictive interactions thereof). Over this benchmark, we evaluate various straightforward pipelines to model such data, including standard two-stage approaches where NLP is used to featurize the text such that AutoML for tabular data can then be applied. Compared with human data science teams, the fully automated methodology that performed best on our benchmark (stack ensembling a multimodal Transformer with various tree models) also manages to rank 1st place when fit to the raw text/tabular data in two MachineHack prediction competitions and 2nd place (out of 2380 teams) in Kaggle's Mercari Price Suggestion Challenge.
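The two-stage idea mentioned in the abstract (featurize the text first, then hand a purely numeric table to a tabular learner) can be sketched as follows. This is a minimal illustration and not the benchmark's pipeline: the bag-of-words featurizer, the function names `fit_vocabulary` and `featurize_row`, and the toy rows are all assumptions made for the example.

```python
def fit_vocabulary(texts):
    """Map each distinct lowercase token to a column index (sorted for stability)."""
    vocab = sorted({tok for t in texts for tok in t.lower().split()})
    return {tok: i for i, tok in enumerate(vocab)}


def featurize_row(numeric, text, vocab):
    """Stage 1: turn the text field into token counts and append them to the numeric columns."""
    counts = [0] * len(vocab)
    for tok in text.lower().split():
        if tok in vocab:  # tokens unseen at fit time are ignored
            counts[vocab[tok]] += 1
    return list(numeric) + counts


# Hypothetical rows: one numeric column (price) and one text column (title).
rows = [([9.5], "red shirt"), ([20.0], "blue shirt blue")]
vocab = fit_vocabulary(text for _, text in rows)
table = [featurize_row(numeric, text, vocab) for numeric, text in rows]
# Stage 2 (not shown): fit any tabular model on `table`.
```

Every row of `table` now has the same fixed width, which is what a tabular learner expects.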

benchmark, dataset, text field, (15 more...)

arXiv.org Machine Learning

2111.02705

Country:

North America > United States > California (0.04)
Asia > India (0.04)
Asia > China (0.04)
(2 more...)

Genre: Research Report > New Finding (0.92)

Industry: Information Technology > Software (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods

Bhardwaj, Peru, Kelleher, John, Costabello, Luca, O'Sullivan, Declan

arXiv.org Artificial Intelligence, Nov-4-2021

Despite the widespread use of Knowledge Graph Embeddings (KGE), little is known about the security vulnerabilities that might disrupt their intended behaviour. We study data poisoning attacks against KGE models for link prediction. These attacks craft adversarial additions or deletions at training time to cause model failure at test time. To select adversarial deletions, we propose to use the model-agnostic instance attribution methods from Interpretable Machine Learning, which identify the training instances that are most influential to a neural model's predictions on test instances. We use these influential triples as adversarial deletions. We further propose a heuristic method to replace one of the two entities in each influential triple to generate adversarial additions. Our experiments show that the proposed strategies outperform the state-of-the-art data poisoning attacks on KGE models and improve the MRR degradation due to the attacks by up to 62% over the baselines.
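For readers unfamiliar with the metric, MRR (mean reciprocal rank) is the average of 1/rank of the correct entity over the test queries, so a successful poisoning attack lowers it. The sketch below computes MRR and a relative drop; the ranks are made up for illustration, and the relative-drop formula is an assumption rather than the paper's exact evaluation protocol.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over queries; ranks are 1-based positions of the true entity."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the true entity for three test triples.
clean_ranks = [1, 2, 4]      # model trained on the original graph
poisoned_ranks = [2, 4, 8]   # model trained on the poisoned graph

mrr_clean = mean_reciprocal_rank(clean_ranks)
mrr_poisoned = mean_reciprocal_rank(poisoned_ranks)

# Fraction of the original MRR that the attack removed.
relative_drop = (mrr_clean - mrr_poisoned) / mrr_clean
```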

kge model, prediction, target triple, (15 more...)

arXiv.org Artificial Intelligence

2111.0312

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Peru (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Learning for Structured Prediction

#artificialintelligence, Nov-3-2021, 12:40:50 GMT

Structured prediction is an umbrella term for supervised machine learning techniques that predict structured objects rather than scalar discrete or real values. As with other commonly used supervised learning techniques, structured prediction models are typically trained on observed data, in which the true value is used to adjust the model parameters. Both prediction with a trained model and the training itself are frequently computationally infeasible.
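As one illustration of predicting a structured object, the sketch below decodes the most likely state sequence for a tiny hidden Markov model with the Viterbi algorithm. Chain-structured models like this are a case where exact prediction stays tractable. The state names, observation names, and probabilities are invented for the example, and the function assumes a non-empty observation list.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for a non-empty observation list."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            column[s] = max(
                (prev[p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Hypothetical toy model.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

decoded = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
```

The output is a whole sequence chosen jointly, not three independent per-step labels, which is what distinguishes structured from scalar prediction.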

natural language processing, prediction, representation, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)

Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey

Min, Bonan, Ross, Hayley, Sulem, Elior, Veyseh, Amir Pouran Ben, Nguyen, Thien Huu, Sainz, Oscar, Agirre, Eneko, Heinz, Ilana, Roth, Dan

arXiv.org Artificial Intelligence, Nov-1-2021

Large, pre-trained transformer-based language models such as BERT have drastically changed the Natural Language Processing (NLP) field. We present a survey of recent work that uses these large language models to solve NLP tasks via pre-training then fine-tuning, prompting, or text generation approaches. We also present approaches that use pre-trained language models to generate data for training augmentation or other purposes. We conclude with discussions on limitations and suggested directions for future research.

computational linguistic, language model, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2111.01243

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(28 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (0.92)
Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(5 more...)

Towards the Generalization of Contrastive Self-Supervised Learning

Huang, Weiran, Yi, Mingyang, Zhao, Xuyang

arXiv.org Artificial Intelligence, Nov-1-2021

Recently, self-supervised learning has attracted great attention since it only requires unlabeled data for training. Contrastive learning is a popular approach for self-supervised learning and empirically performs well in practice. However, the theoretical understanding of its generalization ability on downstream tasks is not well studied. To this end, we present a theoretical explanation of how contrastive self-supervised pre-trained models generalize to downstream tasks. Concretely, we quantitatively show that the self-supervised model has generalization ability on downstream classification tasks if it embeds input data into a feature space with distinguishing centers of classes and closely clustered intra-class samples. With the above conclusion, we further explore SimCLR and Barlow Twins, which are two canonical contrastive self-supervised methods. We prove that the aforementioned feature space can be obtained via any of the methods, and thus explain their success on the generalization on downstream classification tasks. Finally, various experiments are also conducted to verify our theoretical findings.
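The condition described in the abstract (well-separated class centers with tightly clustered samples in each class) is exactly the situation in which a very simple downstream classifier works. The sketch below assigns a point to the nearest class center; it illustrates the geometric intuition only and is not the paper's analysis. The 2-D "embeddings" and label names are invented for the example.

```python
import math


def class_centers(embeddings, labels):
    """Mean embedding per class."""
    sums, counts = {}, {}
    for vec, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centers, vec):
    """Label of the closest class center in Euclidean distance."""
    return min(centers, key=lambda y: math.dist(centers[y], vec))


# Hypothetical embeddings: two tight clusters with distant centers.
embeddings = [[0.1, 0.0], [-0.1, 0.0], [0.9, 1.0], [1.1, 1.0]]
labels = ["a", "a", "b", "b"]

centers = class_centers(embeddings, labels)
pred_near_a = predict(centers, [0.2, 0.1])
pred_near_b = predict(centers, [0.8, 0.9])
```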

augmented data, generalization, nullf, (13 more...)

arXiv.org Artificial Intelligence

2111.00743

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.82)

Clinical Evidence Engine: Proof-of-Concept For A Clinical-Domain-Agnostic Decision Support Infrastructure

Hou, Bojian, Zhang, Hao, Ladizhinsky, Gur, Yang, Stephen, Kuleshov, Volodymyr, Wang, Fei, Yang, Qian

arXiv.org Artificial Intelligence, Oct-31-2021

Abstruse learning algorithms and complex datasets increasingly characterize modern clinical decision support systems (CDSS). As a result, clinicians cannot easily or rapidly scrutinize the CDSS recommendation when facing a difficult diagnosis or treatment decision in practice. Over-trust or under-trust are frequent. Prior research has explored supporting such assessments by explaining DST data inputs and algorithmic mechanisms. This paper explores a different approach: Providing precisely relevant, scientific evidence from biomedical literature. We present a proof-of-concept system, Clinical Evidence Engine, to demonstrate the technical and design feasibility of this approach across three domains (cardiovascular diseases, autism, cancer). Leveraging Clinical BioBERT, the system can effectively identify clinical trial reports based on lengthy clinical questions (e.g., "risks of catheter infection among adult patients in intensive care unit who require arterial catheters, if treated with povidone iodine-alcohol"). This capability enables the system to identify clinical trials relevant to diagnostic/treatment hypotheses -- a clinician's or a CDSS's. Further, Clinical Evidence Engine can identify key parts of a clinical trial abstract, including patient population (e.g., adult patients in intensive care unit who require arterial catheters), intervention (povidone iodine-alcohol), and outcome (risks of catheter infection). This capability opens up the possibility of enabling clinicians to 1) rapidly determine the match between a clinical trial and a clinical question, and 2) understand the result and contexts of the trial without extensive reading. We demonstrate this potential by illustrating two example use scenarios of the system. We discuss the idea of designing DST explanations not as specific to a DST or an algorithm, but as a domain-agnostic decision support infrastructure.

clinical evidence engine, clinician, literature, (11 more...)

arXiv.org Artificial Intelligence

2111.00621

Country:

North America > United States > New York > New York County > New York City (0.05)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Autism (0.90)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Information Management (0.95)
(5 more...)

MetaICL: Learning to Learn In Context

Min, Sewon, Lewis, Mike, Zettlemoyer, Luke, Hajishirzi, Hannaneh

arXiv.org Artificial Intelligence, Oct-29-2021

We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks. This meta-training enables the model to more effectively learn a new task in context at test time, by simply conditioning on a few training examples with no parameter updates or task-specific templates. We experiment on a large, diverse collection of tasks consisting of 142 NLP datasets including classification, question answering, natural language inference, paraphrase detection and more, across seven different meta-training/target splits. MetaICL outperforms a range of baselines including in-context learning without meta-training and multi-task learning followed by zero-shot transfer. We find that the gains are particularly significant for target tasks that have domain shifts from the meta-training tasks, and that using a diverse set of the meta-training tasks is key to improvements. We also show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task training data, and outperforms much bigger models with nearly 8x parameters.
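"Conditioning on a few training examples with no parameter updates" means the examples are simply placed in the model's input ahead of the test input. A minimal sketch of assembling such an input follows; the function name, the space and newline separators, and the sentiment examples are assumptions for illustration and are not taken from the paper.

```python
def build_in_context_prompt(examples, query, sep="\n"):
    """Concatenate k (input, output) pairs followed by the test input."""
    parts = [f"{x} {y}" for x, y in examples]
    parts.append(query)
    return sep.join(parts)


# Hypothetical few-shot examples for a sentiment task.
examples = [("great movie", "positive"), ("boring plot", "negative")]
prompt = build_in_context_prompt(examples, "loved it")
# A language model would then be asked to continue `prompt` with the label.
```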

dataset, meta-training task, unifiedqa, (16 more...)

arXiv.org Artificial Intelligence

2110.15943

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
(2 more...)

GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL

Sontakke, Sumedh A, Iota, Stephen, Hu, Zizhao, Mehrjou, Arash, Itti, Laurent, Schölkopf, Bernhard

arXiv.org Artificial Intelligence, Oct-28-2021

Out-of-distribution (OOD) detection is a well-studied topic in supervised learning. Extending the successes in supervised learning methods to the reinforcement learning (RL) setting, however, is difficult due to the data generating process: RL agents actively query their environment for data, and the data are a function of the policy followed by the agent. An agent could thus neglect a shift in the environment if its policy did not lead it to explore the aspect of the environment that shifted. Therefore, to achieve safe and robust generalization in RL, there exists an unmet need for OOD detection through active experimentation. Here, we attempt to bridge this lacuna by first defining a causal framework for OOD scenarios or environments encountered by RL agents in the wild. Then, we propose a novel task: that of Out-of-Task Distribution (OOTD) detection. We introduce an RL agent that actively experiments in a test environment and subsequently concludes whether it is OOTD or not. We name our method GalilAI, in honor of Galileo Galilei, as it discovers, among other causal processes, that gravitational acceleration is independent of the mass of a body. Finally, we propose a simple probabilistic neural network baseline for comparison, which extends extant Model-Based RL. We find that GalilAI outperforms the baseline significantly. See visualizations of our method at https://galil-ai.github.io/

agent, causal factor, detection, (10 more...)

arXiv.org Artificial Intelligence

2110.15489

Country:

North America > United States > California (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)

A Guide to Generalization and Regularization in Machine Learning

#artificialintelligence, Oct-27-2021, 19:30:53 GMT

Generalization and regularization are two terms that play a significant role when you aim to build a robust machine learning model. The first refers to the model's behaviour on data it has not seen, and the second is a technique for improving the model's performance. Put simply, regularization helps machine learning models generalize better. In this post, we will cover each aspect of these terms and try to understand how they are linked to each other. The major points to be discussed in this article are outlined below.
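One concrete way to see the link is L2 (ridge) regularization on a one-parameter linear model. Minimizing the penalized squared error for y ≈ w·x without an intercept gives the closed form w = Σxy / (Σx² + λ), so a larger penalty λ shrinks the coefficient toward zero. The sketch below is an illustration under those assumptions, with invented data, and is not taken from the article.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form slope of y ~ w*x (no intercept) with an L2 penalty lam * w**2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_unregularized = ridge_slope(xs, ys, lam=0.0)   # ordinary least squares
w_regularized = ridge_slope(xs, ys, lam=14.0)    # penalty equal to the sum of x**2
```

With the penalty equal to Σx², the fitted slope is exactly half the unregularized one, which shows the shrinkage effect directly.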

coefficient, regularity, regularization, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Supervised Learning vs Unsupervised Learning

#artificialintelligence, Oct-27-2021, 17:45:06 GMT

Supervised learning involves learning a function that maps an input to an output based on example input-output pairs. In classification models, the output is discrete. Unlike supervised learning, unsupervised learning is used to draw inferences and find patterns from input data without references to labeled outcomes. Clustering is an unsupervised technique that involves the grouping, or clustering, of data points.
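The contrast can be sketched on one-dimensional toy data: a nearest-neighbour classifier needs labeled input-output pairs, while a two-cluster k-means pass groups the same kind of points without any labels. Both functions, their names, and the data are assumptions for illustration; `two_means` assumes the input contains at least two distinct values.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    """Supervised: return the label of the closest labeled example."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]


def two_means(xs, iterations=10):
    """Unsupervised: find two cluster centers for 1-D points; no labels are used."""
    c0, c1 = min(xs), max(xs)  # start from the two extreme points
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1


# Supervised: labels are given with the training points.
label = nearest_neighbor_predict([1.0, 5.0], ["low", "high"], 4.2)

# Unsupervised: only the points are given.
center_low, center_high = two_means([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])
```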

draw inference and find pattern, example input-output pair, supervised learning vs unsupervised learning, (3 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.86)
