๐Ÿ“ Size Matters

#artificialintelligence 

The recent emergence of pre-trained language models and transformer architectures pushed the creation of larger and larger machine learning models. Google's BERT presented attention mechanism and transformer architecture possibilities as the "next big thing" in ML, and the numbers seem surreal. OpenAI's GPT-2 set a record by processing 1.5 billion parameters, followed by Microsoft's Turing-NLG, which processed 17 billion parameters just to see the new GPT-3 processing an astonishing 175 billion parameters. To not feel complacent, just this week Microsoft announced a new release of its DeepSpeed framework (which powers Turing-NLG), which can train a model with up to a trillion parameters. That sounds insane but it really isn't.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found