Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning

Yu, Yue, Shen, Jiaming, Liu, Tianqi, Qin, Zhen, Yan, Jing Nathan, Liu, Jialu, Zhang, Chao, Bendersky, Michael

arXiv.org Artificial Intelligence 

Large language models (LLMs) have shown remarkable capabilities in various natural language understanding tasks. With only a few demonstration examples, these LLMs can quickly adapt to target tasks without expensive gradient updates. Common strategies to boost such "in-context" learning ability are to ensemble multiple model decoded results and to require the model to generate an explanation along with the prediction. However, these models often treat different class predictions equally and neglect the potential discrepancy between the explanations and predictions. We present EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs. We design two techniques, explanation-guided ensemble and soft probability aggregation, to mitigate the effect of unreliable explanations and improve the consistency between explanations and final predictions. Experiments on seven natural language understanding tasks and four varying-size LLMs demonstrate the effectiveness of our proposed framework.

Recent advancements in Natural Language Processing (NLP) have witnessed the remarkable capabilities of Large Language Models (LLMs) (Brown et al., 2020; Tay et al., 2023; Chowdhery et al., 2022; Anil et al., 2023; Touvron et al., 2023; OpenAI, 2023). These LLMs can rapidly adapt to new tasks by learning from only a few input-output pairs (a.k.a. demonstrations). Yet, beyond those demonstrations, a significant facet of human learning revolves around explanations. Consequently, the integration of free-text explanations into LLM prompting holds great potential to further enhance in-context learning performance. Recent studies have examined how to incorporate free-text explanations into the LLM in-context learning scheme. For instance, the Predict-then-Explain pipeline (Lampinen et al., 2022) proposes to generate the explanation after making the prediction.
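The abstract contrasts ensembles that treat each decoded prediction as an equal vote with an aggregation over soft class probabilities, optionally guided by explanation quality. The sketch below illustrates that contrast only; it is not the authors' implementation. The function names, the dictionary representation of class probabilities, and the `weights` argument (standing in for some explanation-quality score) are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(prob_dists):
    """Hard ensemble: each sample casts one vote for its most likely label."""
    votes = Counter(max(dist, key=dist.get) for dist in prob_dists)
    return votes.most_common(1)[0][0]


def soft_aggregate(prob_dists, weights=None):
    """Soft ensemble: weighted average of per-sample class probabilities.

    `weights` is a hypothetical per-sample score (for example, how reliable
    the accompanying explanation is judged to be). Uniform if omitted.
    """
    if weights is None:
        weights = [1.0] * len(prob_dists)
    total = sum(weights)
    scores = {}
    for dist, weight in zip(prob_dists, weights):
        for label, prob in dist.items():
            scores[label] = scores.get(label, 0.0) + weight * prob / total
    return max(scores, key=scores.get), scores


# Three sampled outputs: two weak "pos" predictions, one confident "neg".
samples = [
    {"pos": 0.55, "neg": 0.45},
    {"pos": 0.52, "neg": 0.48},
    {"pos": 0.05, "neg": 0.95},
]

hard_label = majority_vote(samples)
soft_label, soft_scores = soft_aggregate(samples)
weighted_label, _ = soft_aggregate(samples, weights=[0.9, 0.9, 0.1])
```

With these made-up numbers, the majority vote returns "pos" because two of three samples lean that way, while the uniform soft average returns "neg" because the third sample is far more confident. Down-weighting the third sample, as a low explanation score might, flips the result back to "pos".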
