Building for Tomorrow: Assessing the Temporal Persistence of Text Classifiers

Alkhalifa, Rabab, Kochkina, Elena, Zubiaga, Arkaitz

arXiv.org Artificial Intelligence 

A supervised text classification model is trained on labelled datasets (Sebastiani, 2002). From an experimental perspective, the design and evaluation of classification models typically rely on data pertaining to fixed periods of time. Recent research demonstrates that such models, while showing competitive performance in their experimental environment, underperform when they need to classify new data that is distant in time from the data observed during training (Alkhalifa and Zubiaga, 2022). This deterioration in performance has been demonstrated for different classification tasks, including topic classification (Rocha, Mourão, Pereira, Gonçalves, and Meira, 2008), sentiment classification (Lukes and Søgaard, 2018), hate speech detection (Florio, Basile, Polignano, Basile, and Patti, 2020), stance detection (Alkhalifa, Kochkina, and Zubiaga, 2021) and political ideology detection (Röttger and Pierrehumbert, 2021). The performance drop can happen for multiple reasons, including the evolution of language use (Smith, 2004) or the evolution of public opinion (Bonilla and Mo, 2019), and its extent may vary (Alkhalifa et al., 2021). This poses an important challenge and limitation for such models when one plans to keep using them over a long period of time to classify new, incoming data, as can be the case with a stream of user-generated content (Cheng, Chen, Lee, and Li, 2021).
