How to detect novelty in textual data streams? A comparative study of existing methods

Clément Christophe, Julien Velcin, Jairo Cugliari, Philippe Suignard, Manel Boumghar

arXiv.org Machine Learning 

Since datasets annotated for novelty at the document and/or word level are not readily available, we present a simulation framework for creating textual datasets in which we control how novelty occurs. We also present a benchmark of existing methods for novelty detection in textual data streams: we define a few tasks to solve and compare several state-of-the-art methods on them. The simulation framework allows us to evaluate their performance on a limited set of scenarios and to test their sensitivity to some parameters.
