Combining LDA and Word Embeddings for topic modeling


Latent Dirichlet Allocation (LDA) is a classical approach to topic modeling. Topic modeling is an unsupervised learning task whose goal is to group documents that share the same "topic". A typical example is clustering news articles into categories such as "Finance", "Travel", and "Sport". Before word embeddings, we mostly relied on Bag-of-Words to represent text. However, the world changed after Mikolov et al. introduced word2vec (one example of word embeddings) in 2013.
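To make the Bag-of-Words idea concrete, here is a minimal sketch in plain Python. The `bag_of_words` helper and the three sample headlines are made up for illustration, and the whitespace tokenizer is a simplifying assumption; a real pipeline would more likely use a library vectorizer with proper tokenization and stop-word handling.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, vectors): one word-count vector per document.

    Hypothetical helper for illustration; tokenization is a naive
    lowercase + whitespace split.
    """
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary is the sorted set of all tokens seen in any document.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not occur in this document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Made-up sample headlines (assumption, not from any dataset).
docs = [
    "stocks fall as markets react",
    "team wins the final match",
    "markets rally as stocks rise",
]
vocabulary, vectors = bag_of_words(docs)

for doc, vector in zip(docs, vectors):
    print(doc, "->", vector)
```

Each document becomes a vector of word counts over a shared vocabulary, so word order and word meaning are both discarded. Count vectors like these are the usual input to LDA, while word embeddings aim to recover the similarity between words that this representation throws away.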

