Let's start with the basics. GPT-3 stands for Generative Pre-trained Transformer 3, and it is a sequence transduction model. Simply put, sequence transduction is a technique that transforms an input sequence into an output sequence. GPT-3 is a language model, which means that, using sequence transduction, it can predict the likelihood of an output sequence given an input sequence. This can be used, for instance, to predict which word makes the most sense given a text sequence.
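To make the idea concrete, here is a toy, count-based sketch of "predicting which word makes the most sense": it estimates next-word probabilities from bigram counts. This is only an illustration of the language-modeling objective; GPT-3 itself is a massive neural network, and the corpus and function names below are made up for demonstration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus for illustration only.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each previous word (bigrams).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word_probability(prev, word):
    """Estimate P(word | prev) from the bigram counts."""
    counts = bigrams[prev]
    total = sum(counts.values())
    return counts[word] / total if total else 0.0

# "the" is followed by "cat" twice, "mat" once, and "fish" once,
# so P("cat" | "the") = 2/4 = 0.5.
print(next_word_probability("the", "cat"))  # 0.5
```

A real language model replaces these raw counts with a learned neural network conditioned on the whole preceding sequence, but the quantity being estimated is the same.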
This post demonstrates how to approximate a sequence of vectors using recurrent neural networks; in particular, I will be using the LSTM architecture. The complete code used for this post can be found here. Most of the examples I found on the internet apply the LSTM architecture to natural language processing problems, and I couldn't find an example where this architecture is used to predict continuous values. So the task here is to predict a sequence of real numbers based on previous observations. Traditional neural network architectures can't do this; recurrent neural networks were designed to address exactly this issue, as they can store past information and use it to predict future events. Next, we need to prepare the data in a form our model can accept.
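A minimal sketch of that data-preparation step: slide a fixed-length window over a continuous signal to build (previous observations, next value) pairs that a sequence model such as an LSTM could train on. The window length, the sine-wave signal, and the helper name `make_windows` are illustrative choices, not taken from the original post.

```python
import math

def make_windows(series, window=5):
    """Turn a 1-D series into (input window, next value) training pairs."""
    xs, ys = [], []
    for i in range(len(series) - window):
        xs.append(series[i:i + window])  # the previous `window` observations
        ys.append(series[i + window])    # the real value to predict next
    return xs, ys

# Example: a sampled sine wave as the sequence of real numbers.
signal = [math.sin(0.1 * t) for t in range(100)]
X, y = make_windows(signal, window=5)
print(len(X), len(X[0]))  # 95 windows, each of length 5
```

Each window would then be reshaped into the (timesteps, features) tensor layout the chosen LSTM implementation expects.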
An increasing number of modern antivirus solutions rely on machine learning (ML) techniques to protect users from malware. While ML-based approaches, like FireEye Endpoint Security's MalwareGuard capability, have done a great job at detecting new threats, they also come with substantial development costs. Creating and curating a large set of useful features takes significant amounts of time and expertise from malware analysts and data scientists (note that in this context a feature refers to a property or characteristic of the executable that can be used to distinguish between goodware and malware). In recent years, however, deep learning approaches have shown impressive results in automatically learning feature representations for complex problem domains, like images, speech, and text. Can we take advantage of these advances in deep learning to automatically learn how to detect malware without costly feature engineering?
This article is the first installment of a two-post series on building a machine reading comprehension system using the latest advances in deep learning for NLP. Stay tuned for the second part, where we'll introduce a pre-trained model called BERT that will take your NLP projects to the next level! In the recent past, if you specialized in natural language processing (NLP), there may have been times when you felt a little jealous of your colleagues working in computer vision. It seemed as if they had all the fun: the annual ImageNet classification challenge, Neural Style Transfer, Generative Adversarial Networks, to name a few. At last, the dry spell is over, and the NLP revolution is well underway!
Wouldn't it be cool if a computer could understand the actual human sentiment behind sarcastic texts, sometimes even better than actual humans can? Or what if computers could understand a human language so well that they could estimate how likely they are to encounter any random sentence that you give them? Or maybe they could generate completely fake code snippets of the Linux kernel that look so authentic that they are just as intimidating as the actual source code (well, unless you are a kernel programmer yourself)? What if computers could flawlessly translate English into French, or into any of over 100 languages from all over the world? Or "see" an image and describe the items found in the photo?