Collaborating Authors

The Rise of the Transformers


Rise of the Transformers with Self-Attention MechanismĀ  The intention of this article is to continue in answering the questions that my friends April Rudin, Tripp Braden, Danielle Guzman and Richard Foster-Fletcher asked about the future of AI. Furthermore Irene Iyakovet interview with me about how



Transformer models have become the go-to model in most of the NLP tasks. Many transformer-based models like BERT, ROBERTa, GPT series, etc are considered as the state-of-the-art models in NLP. While NLP is overwhelming with all these models, Transformers are gaining popularity in Computer vision also. Transformers are now used for recognizing and constructing images, image encoding, and many more. While transformer models are taking over the AI field, it is also important to have a low-level understanding of these models.

What is the transformer machine learning model?


This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. In recent years, the transformer model has become one of the main highlights of advances in deep learning and deep neural networks. It is mainly used for advanced applications in natural language processing. Google is using it to enhance its search engine results. OpenAI has used transformers to create its famous GPT-2 and GPT-3 models.

DeepLobe - Machine Learning API as a Service Platform


Day by day the number of machine learning models is increasing at a pace. With this increasing rate, it is hard for beginners to choose an effective model to perform Natural Language Understanding (NLU) and Natural Language Generation (NLG) mechanisms. Researchers across the globe are working around the clock to achieve more progress in artificial intelligence to build agile and intuitive sequence-to-sequence learning models. And in recent times transformers are one such model which gained more prominence in the field of machine learning to perform speech-to-text activities. The wide availability of other sequence-to-sequence learning models like RNNs, LSTMs, and GRU always raises a challenge for beginners when they think about transformers.

Deep Transfer Learning for NLP with Transformers


This is arguably the most important architecture for natural language processing (NLP) today. Specifically, we look at modeling frameworks such as the generative pretrained transformer (GPT), bidirectional encoder representations from transformers (BERT) and multilingual BERT (mBERT). These methods employ neural networks with more parameters than most deep convolutional and recurrent neural network models. Despite the larger size, they've exploded in popularity because they scale comparatively more effectively on parallel computing architecture. This enables even larger and more sophisticated models to be developed in practice. Until the arrival of the transformer, the dominant NLP models relied on recurrent and convolutional components. Additionally, the best sequence modeling and transduction problems, such as machine translation, rely on an encoder-decoder architecture with an attention mechanism to detect which parts of the input influence each part of the output. The transformer aims to replace the recurrent and convolutional components entirely with attention.