Thoughts on the Alignment Implications of Scaling Language Models

Jun-2-2021, 21:36:34 GMT–#artificialintelligence

By now, most of you have probably heard about GPT-3 and what it can do. There have been a bunch of different opinions on what it means for alignment, and this post is yet another one, from a slightly different perspective. Some background: I'm part of EleutherAI, a decentralized research collective (read: glorified Discord server - come join us on Discord for ML, alignment, and dank memes). We're best known for our ongoing effort to create a GPT-3-like large language model, so we have a lot of experience working with transformer models and looking at scaling laws, but we also take alignment very seriously and spend a lot of time thinking about it. I also want to lay out some potential topics for future research that might be fruitful. By the way, I did consider that the implications of scaling laws might be an infohazard, but I think that ship sailed the moment the GPT-3 paper went live. Since we've already been in a race for parameters for some time (see: Megatron-LM, Turing-NLG, Switch Transformer, PanGu-α/盘古α, HyperCLOVA, and Wudao/悟道 2.0, among others), I don't really think this post adds any non-negligible amount of desire for scaling.


