Pseudo-OOD training for robust language models

Dhanasekar Sundararaman, Nikhil Mehta, Lawrence Carin

arXiv.org Artificial Intelligence 

Detecting Out-of-Distribution (OOD) samples (Goodfellow et al., 2014; Hendrycks and Gimpel, 2016; Yang et al., 2021) is vital for developing reliable machine learning systems for various industry-scale applications of natural language understanding (NLP) (Shen et al., 2019; Sundararaman et al., 2020), including intent understanding in conversational dialogues (Zheng et al., 2020; Li et al., 2017), language translation (Denkowski and Lavie, 2011; Sundararaman et al., 2019), and text classification (Aggarwal and Zhai, 2012; Sundararaman et al., 2022). For instance, a language understanding model deployed to support a chat

Motivated by the above limitations, we propose a framework called POsthoc pseudo Ood REgularization (POORE) that generates pseudo-OOD data using the trained classifier and the In-Distribution (IND) samples. As opposed to methods that use outlier exposure, our framework does not rely on any external OOD set. Moreover, POORE can be easily applied to already deployed large-scale models trained on a classification task, without requiring the classifier to be re-trained from scratch. In summary, we make the following contributions:

1. We propose a Mahalanobis-based context masking scheme for generating pseudo-OOD samples that can be used during finetuning.
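The excerpt does not spell out how the Mahalanobis-based masking works, but the general idea of scoring token representations against class-conditional Gaussians fitted on IND data can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the use of a shared inverse covariance, the number of masked tokens `k`, and the choice to mask the tokens *closest* to the IND class centroids are all assumptions made here for concreteness.

```python
import numpy as np

def mahalanobis_scores(embeddings, class_means, shared_cov_inv):
    """Per-token score: minimum Mahalanobis distance to any IND class mean.

    embeddings     : (num_tokens, dim) token representations
    class_means    : list of (dim,) class centroids fitted on IND data
    shared_cov_inv : (dim, dim) inverse of a shared (tied) covariance
    """
    scores = []
    for x in embeddings:
        dists = [(x - mu) @ shared_cov_inv @ (x - mu) for mu in class_means]
        scores.append(min(dists))
    return np.array(scores)

def mask_context(tokens, embeddings, class_means, shared_cov_inv,
                 mask_token="[MASK]", k=2):
    """Replace the k most class-discriminative tokens (hypothetical choice:
    lowest Mahalanobis score, i.e. closest to an IND centroid) with a mask
    token, so the masked sentence serves as a pseudo-OOD sample."""
    scores = mahalanobis_scores(embeddings, class_means, shared_cov_inv)
    masked_idx = set(np.argsort(scores)[:k])
    return [mask_token if i in masked_idx else t
            for i, t in enumerate(tokens)]
```

A masked sentence produced this way could then be fed back during finetuning as a pseudo-OOD example, avoiding any dependence on an external OOD corpus, which is the property the paragraph above emphasizes.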
