Scalable Extraction of Training Data from (Production) Language Models

Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee

arXiv.org Artificial Intelligence 

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.
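Below is a minimal sketch of the kind of extractable-memorization probe the abstract describes: query an open-weights model with short prompts, then check whether its generations reproduce long verbatim spans of candidate training text. The model name, prompts, match threshold, and `reference_corpus.txt` file are illustrative assumptions, not the paper's exact setup or parameters.

```python
# Hedged sketch of an extractable-memorization probe against an open-weights LM.
# Assumes the Hugging Face transformers library and a local text file standing in
# for a corpus of candidate training data; all specific values are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/gpt-neo-1.3B"  # any open-weights causal LM would do
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Sample a continuation for a short prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        do_sample=True,
        top_k=40,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:])


def is_memorized(text: str, corpus: str, min_chars: int = 200) -> bool:
    """Crude check: some long verbatim substring of the generation appears in the corpus."""
    if len(text) < min_chars:
        return False
    for start in range(len(text) - min_chars + 1):
        if text[start:start + min_chars] in corpus:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical local stand-in for candidate training text.
    reference_corpus = open("reference_corpus.txt").read()
    for prompt in ["The", "Once upon a time", "In 2019,"]:
        sample = generate(prompt)
        label = "MEMORIZED" if is_memorized(sample, reference_corpus) else "novel"
        print(f"{prompt!r} -> {label}")
```

In spirit, the divergence attack on ChatGPT reported in the paper replaces the short prompts above with inputs that push the aligned model out of its chatbot-style behavior, after which the same verbatim-match check can be applied to its outputs.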