NLP 101 3/3 -- Neural Architectures for NLP


In my two previous articles (here and here), although I briefly introduced word embeddings created with neural networks, I mainly focused on traditional machine-learning models that take one-hot encoded vectors as input. However, one-hot encoding is a very naive way to represent text, and linear classifiers cannot capture a phenomenon as non-linear as human language. This is where neural networks come in: not just any neural network, but networks that can handle sequential data (or, more generally, data that follows certain patterns, not necessarily sequential ones, but more on that in a bit). Even if you don't know exactly how a neural network works (explaining that is beyond the scope of this article), I assume you have seen an image like this before: as the figure shows, there is only one input layer, so the input data for this simple feed-forward neural network is one-dimensional.
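To make the idea concrete, here is a minimal sketch of what such a network computes: a one-hot encoded word fed through a single hidden layer. The vocabulary, layer sizes, and random weights are illustrative assumptions, not anything from a trained model.

```python
import numpy as np

vocab = ["cat", "sat", "mat"]          # toy vocabulary (assumption)
V, H, C = len(vocab), 4, 2             # vocab size, hidden units, classes

rng = np.random.default_rng(0)
W1 = rng.normal(size=(V, H))           # input -> hidden weights
W2 = rng.normal(size=(H, C))           # hidden -> output weights

def one_hot(word):
    """Encode a word as a sparse 0/1 vector of length V."""
    vec = np.zeros(V)
    vec[vocab.index(word)] = 1.0
    return vec

def forward(word):
    """One-dimensional input layer -> hidden layer (tanh) -> class scores."""
    h = np.tanh(one_hot(word) @ W1)    # the non-linearity a linear model lacks
    return h @ W2

scores = forward("cat")
print(scores.shape)                    # (2,): one score per class
```

The tanh in the hidden layer is exactly what separates this from the linear classifiers discussed in the earlier articles.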

Computational Creativity: The Role of the Transformer


Over the years, computers have become increasingly sophisticated in their ability to identify more and more complex patterns. The field of computational creativity, a multidisciplinary endeavour to build software that can assist humans in a variety of tasks in the arts, science and the humanities, has seen much progress since the early days of computers, when instructions had to be explicitly programmed. In this article, we will attempt to unravel some of the recent developments in generative modelling that have shown significant improvements in computers' ability to generate useful patterns that appeal to human observers. In particular, one type of neural network architecture, the Transformer, will be discussed in detail with regard to its ability to capture longer-term dependencies in text, music and images. Some future directions that this technology could lead to are also discussed.
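The mechanism behind the Transformer's handling of long-range dependencies is self-attention. Below is a minimal sketch of scaled dot-product attention; the sequence length, dimensions, and weights are illustrative assumptions.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Every position attends to every other position, so a dependency's
    reach is not limited by its distance in the sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))            # 5 positions, 8-dim embeddings (toy)
Wq = rng.normal(size=(8, 8))
Wk = rng.normal(size=(8, 8))
Wv = rng.normal(size=(8, 8))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                       # (5, 8): one output per position
```

Because the score matrix relates all pairs of positions directly, the first and last tokens interact in a single step, whereas a recurrent model would need to carry that information through every intermediate state.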

Building a Text Classifier using RNN


In our last story, we discussed building a text classifier without using an RNN. In this article, we are going to build a text classifier using a Recurrent Neural Network (RNN). Imagine being first taught to write the letter A on a practice sheet. During the writing itself, we would realise that the pen is moving out of the line and that the strokes must change, so if possible we would erase, or at least change the direction of the pen, as we went.
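That intuition, adjusting as you go based on what came before, is what the hidden state of an RNN captures. Here is a minimal forward-pass sketch of an RNN text classifier; the dimensions, random weights, and random "token embeddings" are toy assumptions standing in for a trained model.

```python
import numpy as np

E, H, C = 8, 16, 2                     # embedding dim, hidden dim, classes
rng = np.random.default_rng(0)
Wxh = rng.normal(size=(E, H)) * 0.1    # input -> hidden
Whh = rng.normal(size=(H, H)) * 0.1    # hidden -> hidden (the recurrence)
Why = rng.normal(size=(H, C)) * 0.1    # hidden -> class scores

def rnn_classify(sequence):
    """sequence: array of shape (T, E), one embedding per token.
    The hidden state h carries context from all earlier steps."""
    h = np.zeros(H)
    for x in sequence:                 # same weights reused at every step
        h = np.tanh(x @ Wxh + h @ Whh)
    return h @ Why                     # classify from the final state

scores = rnn_classify(rng.normal(size=(10, E)))   # a 10-token "sentence"
print(scores.shape)                    # (2,)
```

The key difference from the feed-forward case is `Whh`: each step's output depends on the previous hidden state, so word order matters.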

What is the transformer machine learning model?


This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. In recent years, the transformer model has become one of the main highlights of advances in deep learning and deep neural networks. It is mainly used for advanced applications in natural language processing. Google is using it to enhance its search engine results. OpenAI has used transformers to create its famous GPT-2 and GPT-3 models.

Evolution of Deep learning models


None of the deep learning models discussed here works as a classification algorithm on its own. Instead, they can be seen as pretraining: automated feature selection and learning that builds a hierarchy of features. Once trained (i.e., once the features have been learned), the input vectors are transformed into a better representation, which is in turn passed on to an actual classifier such as an SVM or logistic regression. This can be represented as shown below.
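The two-stage pipeline described above can be sketched as follows. A random projection with a tanh stands in for a pretrained feature extractor, and a hand-rolled logistic regression plays the role of the final classifier; the data, labels, and learning rate are all toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                 # raw input vectors
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # toy labels

# Stage 1: "pretrained" transform into a better representation
# (here just a fixed random projection as a placeholder).
W_pre = rng.normal(size=(20, 10)) * 0.1
features = np.tanh(X @ W_pre)

# Stage 2: a plain logistic regression on the learned features.
w, b = np.zeros(10), 0.0
for _ in range(500):                           # simple gradient descent
    p = 1 / (1 + np.exp(-(features @ w + b)))  # predicted probabilities
    grad = p - y
    w -= 0.1 * features.T @ grad / len(y)
    b -= 0.1 * grad.mean()

acc = ((features @ w + b > 0) == y).mean()     # training accuracy
print(acc)
```

In practice the feature extractor would be a trained deep network and the classifier an off-the-shelf SVM or logistic regression, but the division of labour is the same.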