Causality for Natural Language Processing

Jin, Zhijing

arXiv.org Artificial Intelligence 

In the field of natural language processing (NLP), the capability to infer and reason about causality is increasingly recognized as a critical component of intelligent systems. Despite the recent advancement of large language models (LLMs) (Radford et al., 2019; Devlin et al., 2019; Brown et al., 2020; Zhang et al., 2022; OpenAI, 2023; Ignat et al., 2024, inter alia), a key question remains: Can these models understand and reason about causality? This is a skill AI agents must demonstrate before we can trust them to be integrated into decision-making systems. Moreover, even when LLMs succeed at reasoning to some extent, how their decisions are made remains opaque, creating a strong need for interpretability (Luo and Specia, 2024; Räuker et al., 2023; Zou et al., 2023). To bridge this gap, this thesis explores various facets of causal reasoning in LLMs. We present a series of studies that collectively advance our knowledge of how well these models perform causal reasoning (Part I), how their decisions are made (Part II), how causality among learning variables influences NLP tasks (Part III), and how causality and NLP can together analyze social problems (Part IV). Below we provide an overview of the four parts and their corresponding chapters.