The Download: how OpenAI tests its models, and the ethics of uterus transplants

Nov-22-2024, 13:10:00 GMT–MIT Technology Review

OpenAI has lifted the lid (just a crack) on its safety-testing processes. It has put out two papers describing how it stress-tests its powerful large language models to try to identify potentially harmful or otherwise unwanted behavior, an approach known as red-teaming. The first paper describes how OpenAI directs an extensive network of human testers outside the company to vet the behavior of its models before they are released. The second presents a new way to automate parts of the testing process, using a large language model like GPT-4 to come up with novel ways to bypass its own guardrails. MIT Technology Review got an exclusive preview of the work.

large language model, machine learning, natural language

News Web Page

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning > Generative AI (0.89)