Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Anwar, Usman, Saparov, Abulhair, Rando, Javier, Paleka, Daniel, Turpin, Miles, Hase, Peter, Lubana, Ekdeep Singh, Jenner, Erik, Casper, Stephen, Sourbut, Oliver, Edelman, Benjamin L., Zhang, Zhaowei, Günther, Mario, Korinek, Anton, Hernandez-Orallo, Jose, Hammond, Lewis, Bigelow, Eric, Pan, Alexander, Langosco, Lauro, Korbak, Tomasz, Zhang, Heidi, Zhong, Ruiqi, Ó hÉigeartaigh, Seán, Recchia, Gabriel, Corsi, Giulio, Chan, Alan, Anderljung, Markus, Edwards, Lilian, Bengio, Yoshua, Chen, Danqi, Albanie, Samuel, Maharaj, Tegan, Foerster, Jakob, Tramer, Florian, He, He, Kasirzadeh, Atoosa, Choi, Yejin, Krueger, David

Apr-15-2024–arXiv.org Artificial Intelligence

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.

cooperation and disincentivizing high-risk approach, large language model, machine learning

Country:
- Africa (0.92)
- Asia > Middle East (0.92)
- Europe > United Kingdom
  - England
    - Cambridgeshire > Cambridge (0.14)
    - Oxfordshire > Oxford (0.14)
- North America
  - Canada > Ontario
    - Toronto (0.14)
  - United States
    - California (0.92)
    - Massachusetts > Middlesex County
      - Cambridge (0.13)
- South America (0.92)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (1.00)
  - Promising Solution (1.00)

Industry:
- Law > Civil Rights & Constitutional Law (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Banking & Finance > Economy (0.67)
- Materials (0.92)
- Government
  - Military (1.00)
  - Regional Government
    - Europe Government (1.00)
    - North America Government > United States Government (1.00)
- Media > News (0.92)
- Energy (0.67)
- Information Technology
  - Security & Privacy (1.00)
  - Services (1.00)
- Education > Educational Setting (0.67)
- Leisure & Entertainment > Games (1.00)
- Social Sector (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.92)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language > Large Language Model (1.00)
  - Robots > Autonomous Vehicles
    - Drones (0.67)
