Clustering Similar Stories Using LDA -- Flipboard Engineering

There is more to a story than meets the eye, and some stories deserve to be presented from more than one perspective. With Flipboard 4.0, we released story roundups, a new feature that adds coverage from multiple sources to a story and gives you a fuller picture of an event. At our scale, with millions of articles and a constant stream of new documents, generating these roundups manually is impossible. So we developed a clustering algorithm that is both fast and scalable, and in this blog post I will explain how we create these roundups on Flipboard. Although many sophisticated automatic clustering algorithms exist, such as k-means and agglomerative clustering, story clustering remains a non-trivial problem.