A Little Confidence Goes a Long Way

John Scoville, Shang Gao, Devanshu Agrawal, Javed Qadrud-Din

Aug-20-2024–arXiv.org Artificial Intelligence

We introduce a group of related methods for binary classification tasks using probes of the hidden state activations in large language models (LLMs). Performance is on par with the largest and most advanced LLMs currently available, while requiring orders of magnitude fewer computational resources and no labeled data. The approach involves translating class labels into semantically rich descriptions, using spontaneous symmetry breaking of multilayer perceptron probes for unsupervised learning and inference, training probes to generate confidence scores (prior probabilities) from hidden state activations subject to known constraints via entropy maximization, and selecting the most confident probe model from an ensemble for prediction. These techniques are evaluated on four datasets using five base LLMs.
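The final step, selecting the most confident probe from an ensemble, can be illustrated with a minimal sketch. The abstract does not say how confidence is measured, so this sketch assumes that "most confident" means the lowest mean binary entropy of a probe's scores over a batch. The function names, the 0.5 threshold, and the example scores are all hypothetical and are not taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli(p) prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain prediction carries no entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mean_entropy(scores: list[float]) -> float:
    """Average entropy of one probe's confidence scores over a batch."""
    return sum(binary_entropy(p) for p in scores) / len(scores)


def select_most_confident(ensemble_scores: list[list[float]]) -> int:
    """Index of the probe with the lowest mean entropy (assumed criterion)."""
    return min(
        range(len(ensemble_scores)),
        key=lambda i: mean_entropy(ensemble_scores[i]),
    )


def predict(scores: list[float], threshold: float = 0.5) -> list[int]:
    """Turn confidence scores into binary labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in scores]


# Hypothetical confidence scores from three probes on the same four inputs.
ensemble_scores = [
    [0.55, 0.48, 0.60, 0.41],  # hesitant probe: scores near 0.5
    [0.95, 0.03, 0.90, 0.08],  # decisive probe: scores near 0 or 1
    [0.70, 0.35, 0.65, 0.30],  # in between
]

best = select_most_confident(ensemble_scores)
labels = predict(ensemble_scores[best])
```

In this toy example the second probe is chosen because its scores sit furthest from 0.5. A real pipeline would obtain the scores from probes applied to LLM hidden state activations, which this sketch does not model.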



Country:
- North America > United States
  - California (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report (0.83)

Industry:
- Law (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks
      - Deep Learning (1.00)
      - Perceptrons (0.69)
