Meet M6 -- 10 Trillion Parameters at 1% GPT-3's Energy Cost
I can confidently say artificial intelligence is advancing fast when, with just one year in between, a neural network 50 times larger than another can be trained at 100 times lower energy cost. On June 25, Alibaba DAMO Academy (the R&D branch of Alibaba) announced it had built M6, a large multimodal, multitasking language model with 1 trillion parameters -- already over 5x the size of GPT-3, which serves as the standard for measuring the rate of progress of large AI models. The model was designed for multimodality and multitasking, going a step further than previous models toward general intelligence. In terms of abilities, M6 resembles GPT-3 and other similar models like Wu Dao 2.0 or MT-NLG 530B (about which we have very little information). InfoQ, a popular Chinese tech magazine, compiles M6's main skills: "[It] has cognition and creativity beyond traditional AI, is good at drawing, writing, question and answer, and has broad application prospects in many fields such as e-commerce, manufacturing, literature and art."
Nov-11-2021, 03:55:22 GMT