First Trillion Parameter Model on HuggingFace - Mixture of Experts (MoE)
Google AI's Switch Transformers, a Mixture of Experts (MoE) model released a few months ago, is now available on HuggingFace. The model scales up to 1.6 trillion parameters and is openly accessible. MoE models are widely considered the next step for Natural Language Processing (NLP) architectures because they scale efficiently: each token activates only a small part of the network's parameters. The architecture is similar to the classic T5 model, with the dense Feed Forward layer replaced by a Sparse Feed Forward layer.
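To make the Sparse Feed Forward idea concrete, here is a minimal NumPy sketch of a layer in which a router sends each token to a single expert feed-forward network. It is an illustration only, not the HuggingFace or Google implementation: the function name `switch_ffn`, the ReLU activation, the tensor shapes, and the random weights are all assumptions, and pieces of a real MoE layer such as expert capacity limits and the load-balancing loss are left out.

```python
import numpy as np


def switch_ffn(x, router_w, w_in, w_out):
    """Top-1 routed sparse feed-forward layer (hypothetical sketch).

    x:        (tokens, d_model)            token representations
    router_w: (d_model, n_experts)         router weights
    w_in:     (n_experts, d_model, d_ff)   first projection of each expert
    w_out:    (n_experts, d_ff, d_model)   second projection of each expert
    """
    # Router: softmax over experts for every token.
    logits = x @ router_w
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Each token is sent to its single most probable expert.
    expert_idx = probs.argmax(axis=-1)
    gate = probs[np.arange(x.shape[0]), expert_idx]

    y = np.zeros_like(x)
    for e in range(router_w.shape[1]):
        mask = expert_idx == e
        if not mask.any():
            continue  # this expert received no tokens
        hidden = np.maximum(x[mask] @ w_in[e], 0.0)  # ReLU (assumed activation)
        # Scale the expert output by the router probability of the chosen expert.
        y[mask] = gate[mask, None] * (hidden @ w_out[e])
    return y, expert_idx


# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
tokens, d_model, d_ff, n_experts = 8, 16, 32, 4
x = rng.normal(size=(tokens, d_model))
router_w = rng.normal(size=(d_model, n_experts))
w_in = rng.normal(size=(n_experts, d_model, d_ff))
w_out = rng.normal(size=(n_experts, d_ff, d_model))

y, expert_idx = switch_ffn(x, router_w, w_in, w_out)
print(y.shape)       # one output vector per token, same width as the input
print(expert_idx)    # which expert handled each token
```

Because every token passes through only one expert, adding experts increases the parameter count while the per-token compute stays roughly constant, which is the scaling property the summary refers to.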
Nov-19-2022, 04:00:13 GMT
- Technology: