Language models are weak learners

Manikandan, Hariharan; Jiang, Yiding; Kolter, J. Zico

Jun-24-2023–arXiv.org Artificial Intelligence

A central notion in practical and theoretical machine learning is that of a $\textit{weak learner}$: a classifier that achieves better-than-random performance on any given distribution over data, even if only by a small margin. Such weak learners form the practical basis for canonical machine learning methods such as boosting. In this work, we show that prompt-based large language models (LLMs) can operate effectively as weak learners. Specifically, we use an LLM as a weak learner in a boosting algorithm applied to tabular data. We show that, given text descriptions of tabular data samples drawn according to the distribution of interest, an LLM can produce a summary of the samples that serves as a template for classification and thereby acts as a weak learner on this task. We incorporate these models into a boosting approach, which in some settings can leverage the knowledge within the LLM to outperform traditional tree-based boosting. The approach outperforms few-shot learning and, occasionally, even more involved fine-tuning procedures, particularly on tasks with small numbers of data points. The results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
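The loop described in the abstract (sample rows according to the current distribution, turn them into text, obtain a classifier from the weak learner, reweight) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an AdaBoost-style update with resampling, which the abstract does not specify, and it swaps the LLM for a hypothetical stand-in, `stub_weak_learner`, that fits a single-feature threshold rule. The names `row_to_text`, `boost`, and `predict` and the text format are likewise assumptions made for this example.

```python
import math
import random


def row_to_text(row, feature_names):
    """Serialize one tabular row as a plain-text description (hypothetical format)."""
    return ". ".join(f"{name} is {value}" for name, value in zip(feature_names, row)) + "."


def stub_weak_learner(rows, labels):
    """Stand-in for the LLM call (assumption): returns the single-feature
    threshold rule with the fewest errors on the sampled rows."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            for sign in (1, -1):
                preds = [sign if r[j] >= t else -sign for r in rows]
                err = sum(p != y for p, y in zip(preds, labels))
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    _, j, t, sign = best
    return lambda r: sign if r[j] >= t else -sign


def boost(rows, labels, weak_learner, rounds=10, sample_size=None, seed=0):
    """AdaBoost-style loop with resampling; labels must be +1 or -1."""
    rng = random.Random(seed)
    n = len(rows)
    sample_size = sample_size or n
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Draw examples according to the current distribution over the data.
        idx = rng.choices(range(n), weights=weights, k=sample_size)
        h = weak_learner([rows[i] for i in idx], [labels[i] for i in idx])
        preds = [h(r) for r in rows]
        # Weighted error of this round's classifier on the full training set.
        err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
        if err >= 0.5:
            continue  # not better than random on this distribution; try again
        err = max(err, 1e-10)  # avoid division by zero for a perfect classifier
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Upweight misclassified examples, downweight correct ones, renormalize.
        weights = [w * math.exp(-alpha * y * p) for w, p, y in zip(weights, preds, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all accepted weak classifiers."""
    score = sum(alpha * h(row) for alpha, h in ensemble)
    return 1 if score >= 0 else -1
```

To move this toward the setting the abstract describes, `stub_weak_learner` would be replaced by a function that serializes the sampled rows with something like `row_to_text`, prompts an LLM for a summary, and returns a classifier that uses that summary as its template. The prompt design and the way predictions are read off the summary are not given in the abstract and are left open here.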

large language model, machine learning, natural language, (17 more...)

Country:
- Asia > Taiwan (0.04)
- North America > United States
  - New York (0.04)
  - Wisconsin > Dane County
    - Madison (0.04)
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.04)
  - Illinois > Cook County
    - Chicago (0.04)
- Europe
  - Italy (0.04)
  - Portugal > Lisbon
    - Lisbon (0.06)
  - France > Auvergne-Rhône-Alpes
    - Lyon > Lyon (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Education (0.67)
- Health & Medicine > Therapeutic Area
  - Cardiology/Vascular Diseases (0.47)
  - Endocrinology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
