Topic Modeling and Latent Dirichlet Allocation (LDA) in Python
Topic modeling is a type of statistical modeling for discovering the abstract "topics" that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is an example of topic model and is used to classify text in a document to a particular topic. It builds a topic per document model and words per topic model, modeled as Dirichlet distributions. Here we are going to apply LDA to a set of documents and split them into topics. The data set we'll use is a list of over one million news headlines published over a period of 15 years and can be downloaded from Kaggle.
Jun-3-2018, 09:52:54 GMT
- Technology: