Pseudo-OOD training for robust language models

Dhanasekar Sundararaman, Nikhil Mehta, Lawrence Carin

arXiv.org Artificial Intelligence 

Detecting Out-of-Distribution (OOD) samples (Goodfellow et al., 2014; Hendrycks and Gimpel, 2016; Yang et al., 2021) is vital for developing reliable machine learning systems for various industry-scale applications of natural language understanding (NLP) (Shen et al., 2019; Sundararaman et al., 2020), including intent understanding in conversational dialogues (Zheng et al., 2020; Li et al., 2017), language translation (Denkowski and Lavie, 2011; Sundararaman et al., 2019), and text classification (Aggarwal and Zhai, 2012; Sundararaman et al., 2022). For instance, a language understanding model deployed to support a chat

Motivated by the above limitations, we propose a framework called POsthoc pseudo Ood REgularization (POORE) that generates pseudo-OOD data using the trained classifier and the In-Distribution (IND) samples. As opposed to methods that use outlier exposure, our framework does not rely on any external OOD set. Moreover, POORE can be easily applied to already deployed large-scale models trained on a classification task, without requiring the classifier to be re-trained from scratch. In summary, we make the following contributions:

1. We propose a Mahalanobis-based context masking scheme for generating pseudo-OOD samples that can be used during finetuning.
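The excerpt does not spell out how the Mahalanobis-based masking works, but the general idea of scoring token representations against class-conditional Gaussians fitted on IND data can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the use of a shared inverse covariance, the number of masked tokens `k`, and the choice to mask the tokens *closest* to the IND class centroids are all assumptions made here for concreteness.

```python
import numpy as np

def mahalanobis_scores(embeddings, class_means, shared_cov_inv):
    """Per-token score: minimum Mahalanobis distance to any IND class mean.

    embeddings     : (num_tokens, dim) token representations
    class_means    : list of (dim,) class centroids fitted on IND data
    shared_cov_inv : (dim, dim) inverse of a shared (tied) covariance
    """
    scores = []
    for x in embeddings:
        dists = [(x - mu) @ shared_cov_inv @ (x - mu) for mu in class_means]
        scores.append(min(dists))
    return np.array(scores)

def mask_context(tokens, embeddings, class_means, shared_cov_inv,
                 mask_token="[MASK]", k=2):
    """Replace the k most class-discriminative tokens (hypothetical choice:
    lowest Mahalanobis score, i.e. closest to an IND centroid) with a mask
    token, so the masked sentence serves as a pseudo-OOD sample."""
    scores = mahalanobis_scores(embeddings, class_means, shared_cov_inv)
    masked_idx = set(np.argsort(scores)[:k])
    return [mask_token if i in masked_idx else t
            for i, t in enumerate(tokens)]
```

A masked sentence produced this way could then be fed back during finetuning as a pseudo-OOD example, avoiding any dependence on an external OOD corpus, which is the property the paragraph above emphasizes.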
