PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales

Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren

arXiv.org Artificial Intelligence 

Neural language models (LMs) have achieved impressive results on various language-based reasoning tasks by utilizing latent knowledge encoded in their own pretrained parameters. To make this reasoning process more explicit, recent works retrieve a rationalizing LM's internal knowledge by training or prompting it to generate free-text rationales, which can be used to guide task predictions made by either the same LM or a separate reasoning LM. However, rationalizing LMs require expensive rationale annotation and/or computation, without any assurance that their generated rationales improve LM task performance or faithfully reflect LM decision-making. In this paper, we propose PINTO, an LM pipeline that rationalizes via prompt-based learning, and learns to faithfully reason over rationales via counterfactual regularization. First, PINTO maps out a suitable reasoning process for the task input by prompting a frozen rationalizing LM to generate a free-text rationale. Second, PINTO's reasoning LM is fine-tuned to solve the task using the generated rationale as context, while regularized to output less confident predictions when the rationale is perturbed. Across four datasets, we show that PINTO significantly improves the generalization ability of the reasoning LM, yielding higher performance on both in-distribution and out-of-distribution test sets. Also, we find that PINTO's rationales are more faithful to its task predictions than those generated by competitive baselines.

Many language-based reasoning tasks require retrieving and reasoning over knowledge beyond the task input--e.g., commonsense reasoning and closed-book QA (Figure 1, left) (Talmor et al., 2018; Mihaylov et al., 2018). Neural language models (LMs) have achieved impressive results on such tasks by utilizing latent knowledge encoded in their pretrained parameters (Raffel et al., 2020b; Brown et al., 2020). Still, given LMs' black-box nature, it is unclear whether this knowledge is being used properly (Doshi-Velez & Kim, 2017; Lipton, 2018). Previous studies have shown that LMs often learn spurious correlations from artifacts in downstream training data, thus limiting their generalizability (Branco et al., 2021; Geirhos et al., 2020; D'Amour et al., 2020).

With this in mind, a number of prior works aim to make LMs' reasoning processes more explicit by generating free-text rationales, which use LMs' internal knowledge to describe a reasoning process in natural language (Narang et al., 2020; Wei et al., 2022b; Marasović et al., 2022; Zelikman et al., 2022). In the fine-tuned self-rationalizing paradigm, a single LM is fine-tuned to jointly generate the task output and rationale (Narang et al., 2020; Marasović et al., 2022; Zelikman et al., 2022). In the prompted self-rationalizing paradigm, a single LM is instead frozen and prompted to jointly generate the task output and rationale, with the prompt consisting of a few input-output-rationale demonstrations (Wei et al., 2022b).
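To make the second stage of the pipeline concrete, the sketch below shows one plausible way to implement the counterfactual regularization described above: a standard cross-entropy task loss when the prompt-generated rationale is intact, plus a term that pushes the reasoning LM's answer distribution toward uniform (i.e., low confidence) when the rationale is perturbed, e.g., by masking or replacing its tokens. This is a minimal PyTorch sketch under assumed names; the function name, the uniform-target formulation, and the `reg_weight` hyperparameter are illustrative assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def counterfactual_regularized_loss(logits_with_rationale,
                                    logits_with_perturbed_rationale,
                                    labels,
                                    reg_weight=1.0):
    """Illustrative PINTO-style objective (assumed formulation, not the paper's exact loss).

    logits_with_rationale:           [batch, num_choices] answer scores when the reasoning LM
                                     sees the task input plus the prompt-generated rationale.
    logits_with_perturbed_rationale: [batch, num_choices] answer scores when the rationale has
                                     been perturbed (e.g., tokens masked or replaced).
    labels:                          [batch] gold answer indices.
    """
    # Task loss: with the intact rationale as context, predict the gold answer.
    task_loss = F.cross_entropy(logits_with_rationale, labels)

    # Counterfactual regularization: with a perturbed rationale, the model should be
    # unconfident, so we penalize deviation from the uniform distribution over answer
    # choices (cross-entropy against a uniform target).
    num_choices = logits_with_perturbed_rationale.size(-1)
    log_probs = F.log_softmax(logits_with_perturbed_rationale, dim=-1)
    uniform_target = torch.full_like(log_probs, 1.0 / num_choices)
    counterfactual_loss = -(uniform_target * log_probs).sum(dim=-1).mean()

    return task_loss + reg_weight * counterfactual_loss


# Toy usage with random scores for a 4-way multiple-choice batch.
if __name__ == "__main__":
    batch, num_choices = 8, 4
    logits_full = torch.randn(batch, num_choices)
    logits_perturbed = torch.randn(batch, num_choices)
    labels = torch.randint(0, num_choices, (batch,))
    print(counterfactual_regularized_loss(logits_full, logits_perturbed, labels))
```

The intent of the second term is that the reasoning LM only earns high confidence when the rationale actually supports the answer, which is what ties its predictions to the rationale rather than to shortcuts in the input.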
