

Implementing Deep Learning Methods and Feature Engineering for Text Data: FastText

@machinelearnbot

Editor's note: This post is only one part of a far more thorough and in-depth original, found here, which covers much more than what is included here. The FastText model was first introduced by Facebook in 2016 as an extension of, and a supposed improvement on, the vanilla Word2Vec model. It is based on the original paper, 'Enriching Word Vectors with Subword Information' by Mikolov et al., which is an excellent read to gain an in-depth understanding of how this model works. Overall, FastText is a framework for learning word representations and also for performing robust, fast and accurate text classification. The framework has been open-sourced by Facebook on GitHub.

  artificial intelligence, learning method and feature engineering, machine learning, (12 more...)

Implementing Deep Learning Methods and Feature Engineering for Text Data: The GloVe Model


Editor's note: This post is only one part of a far more thorough and in-depth original, found here, which covers much more than what is included here. GloVe stands for Global Vectors; it is an unsupervised learning model which can be used to obtain dense word vectors similar to Word2Vec. However, the technique is different: training is performed on an aggregated global word-word co-occurrence matrix, giving us a vector space with meaningful sub-structures. The method was invented at Stanford by Pennington et al., and I recommend you read the original paper, 'GloVe: Global Vectors for Word Representation', which is an excellent read to get some perspective on how this model works. We won't cover the implementation of the model from scratch in too much detail here, but if you are interested in the actual code, you can check out the official GloVe page.
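To make the "aggregated global word-word co-occurrence matrix" concrete, here is a toy sketch (not the official GloVe code) that counts, for each word, how often every other word appears within a fixed window around it; the corpus and window size are illustrative assumptions:

```python
# Toy sketch of the global word-word co-occurrence counts GloVe trains on.
# NOTE: the real GloVe implementation weights counts by 1/distance;
# plain counts are used here for simplicity.
from collections import Counter

corpus = [
    ["the", "quick", "brown", "fox"],
    ["the", "lazy", "brown", "dog"],
]
window = 2

cooc = Counter()
for sent in corpus:
    for i, word in enumerate(sent):
        # count every context word within `window` positions of `word`
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                cooc[(word, sent[j])] += 1

print(cooc[("brown", "the")])  # 2: "brown" co-occurs with "the" in both sentences
```

GloVe then fits word vectors so that their dot products approximate the logarithms of these aggregated co-occurrence counts, which is what yields the meaningful sub-structures mentioned above.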


Implementing Deep Learning Methods and Feature Engineering for Text Data: The Skip-gram Model


Editor's note: This post is only one part of a far more thorough and in-depth original, found here, which covers much more than what is included here. The Skip-gram model architecture usually tries to achieve the reverse of what the CBOW model does: it tries to predict the source context words (surrounding words) given a target word (the center word). With the CBOW model, we get pairs of (context_window, target_word); if we consider a context window of size 2, we have examples like ([quick, fox], brown), ([the, brown], quick), ([the, dog], lazy) and so on. Since the skip-gram model's aim is to predict the context from the target word, the model typically inverts the contexts and targets, and tries to predict each context word from its target word.
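The inversion described above can be sketched as a small pair-generation routine: for each target word, emit one (target, context) pair per word inside the context window (the sentence and helper name here are illustrative, not from the original post):

```python
# Sketch of skip-gram training-pair generation with a context window of 2,
# matching the CBOW examples in the text but inverted to (target, context).
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, target in enumerate(tokens):
        # every word within `window` positions of the target is a context word
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                pairs.append((target, tokens[j]))
    return pairs

sentence = ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
pairs = skipgram_pairs(sentence)
print(pairs[:4])
# [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown')]
```

Note how the CBOW example ([quick, fox], brown) becomes the skip-gram pairs (brown, quick) and (brown, fox): one training example per context word, each predicted from the shared target.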