A Framework for Monitoring and Retraining Language Models in Real-World Applications

Jaykumar Kasundra, Claudia Schulz, Melicaalsadat Mirsafian, Stavroula Skylaki


The typical model development lifecycle consists of four phases: 1) problem scoping, 2) data definition and collection, 3) model training and iterative improvement through error analysis, and 4) model deployment in production and implementation of continuous monitoring and retraining [1]. While the first three phases are typically performed in an offline setting, model deployment is the critical step where the ML model becomes available in a production environment, i.e. a live application, where it must process live data and ideally sustain its performance over time to keep delivering value. Model monitoring refers to the process of evaluating the quality of the production data and the performance of the model according to relevant metrics over time. When either data quality or model performance does not meet predefined criteria, a monitoring warning can be triggered to alert the model owners. Defining an effective model monitoring and retraining strategy is therefore key to successful ML model deployment, since it safeguards model quality over prolonged periods of time.
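To make this monitoring loop concrete, the sketch below checks one batch of production data against predefined criteria and returns warnings for the model owners. It is a minimal illustration under assumed conditions, not the framework's implementation: the function name, the thresholds, and the choice of metrics (missing-value rate as a data-quality check, macro-F1 as a performance check) are all hypothetical.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical "predefined criteria"; in practice the model owners choose
# these for their application.
MAX_MISSING_RATE = 0.05   # data-quality criterion
MIN_MACRO_F1 = 0.80       # model-performance criterion (requires labels)

def monitor_batch(features: np.ndarray, predictions, labels=None):
    """Evaluate one batch of production data and return monitoring warnings."""
    warnings = []

    # Data quality: fraction of missing feature values in the live batch.
    missing_rate = float(np.isnan(features).mean())
    if missing_rate > MAX_MISSING_RATE:
        warnings.append(f"data quality: missing rate {missing_rate:.1%} "
                        f"exceeds {MAX_MISSING_RATE:.0%}")

    # Model performance: only measurable once ground-truth labels arrive,
    # which in many applications happens with a delay.
    if labels is not None:
        f1 = f1_score(labels, predictions, average="macro", zero_division=0)
        if f1 < MIN_MACRO_F1:
            warnings.append(f"performance: macro-F1 {f1:.2f} "
                            f"below {MIN_MACRO_F1:.2f}")

    return warnings  # a non-empty list triggers an alert to the model owners

# Example: a live batch with missing features and delayed labels.
batch = np.array([[0.1, np.nan], [0.3, 0.7], [np.nan, 0.2]])
for alert in monitor_batch(batch, predictions=[1, 0, 1], labels=[1, 1, 1]):
    print("WARNING:", alert)
```

Separating the two kinds of checks matters in practice: data-quality criteria can be evaluated on every incoming batch, whereas performance criteria are often delayed until labels become available, so a monitoring strategy typically combines both.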