Thoughts on the Alignment Implications of Scaling Language Models
By now, most of you have probably heard about GPT-3 and what it does. There have been a bunch of different opinions on what it means for alignment, and this post is yet another opinion from a slightly different perspective. Some background: I'm part of EleutherAI, a decentralized research collective (read: glorified Discord server - come join us on Discord for ML, alignment, and dank memes). We're best known for our ongoing effort to create a GPT-3-like large language model, so we have a lot of experience working with transformer models and looking at scaling laws, but we also take alignment very seriously and spend a lot of time thinking about it. I also want to lay out some potential topics for future research that might be fruitful. By the way, I did consider that the implications of scaling laws might be an infohazard, but I think that ship sailed the moment the GPT-3 paper went live. Since we've already been in a race for parameters for some time (see: Megatron-LM, Turing-NLG, Switch Transformer, PanGu-α/盘古α, HyperCLOVA, Wudao/悟道 2.0, among others), I don't really think this post will add any non-negligible amount of desire for scaling.
Jun-2-2021, 21:36:34 GMT