Goto

Collaborating Authors

 additional analysis



6754e06e46dfa419d5afe3c9781cecad-AuthorFeedback.pdf

Neural Information Processing Systems

So, the fact that our training data comes solely from infectious virus, which would be highly probable (or "grammatical") sequences under our language model (LM), is a key feature of our approach. Importantly, however, we note that, fundamentally, CSCS is presented in generality here, so these methods are not strictly "competitor methods" in the sense that, if one were to work better, it would still be incorporable within the CSCS framework. "ℓ1 rather than Euclidean": We used ℓ1 since it has nicer properties than, e.g., ℓ2 in high-dimensional spaces (Aggarwal et al., ICDT, 2001), but other distance metrics could be empirically quantified. "theoretical detail"/"how the method works": We apologize for the sparsity of detail. "choice of beta": We find good robustness for β values reasonably close to 1 (e.g., 0.5-2).
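The feedback above references two ingredients of CSCS scoring: an ℓ1 semantic-change term computed in the LM's embedding space and a grammaticality term given by the LM's probability of the mutated sequence, combined with a weight β. A minimal sketch of a rank-based combination consistent with that description follows; the function and variable names (cscs_scores, embed-style arrays, p_mut) are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def cscs_scores(delta_sem, gram_prob, beta=1.0):
    """Rank-combine semantic change and grammaticality (hedged sketch).

    delta_sem : array of l1 distances between the wild-type embedding and
                each mutant's embedding (semantic change).
    gram_prob : array of language-model probabilities for each mutant
                (grammaticality).
    beta      : weight on the grammaticality term; the response reports
                robustness for beta roughly in [0.5, 2].
    """
    # Higher semantic change and higher LM probability both raise the score.
    rank_sem = np.argsort(np.argsort(delta_sem))   # ascending ranks
    rank_gram = np.argsort(np.argsort(gram_prob))
    return rank_sem + beta * rank_gram

# Illustrative usage with hypothetical embeddings z_wt (wild type),
# z_mut (one row per candidate mutant), and LM probabilities p_mut.
rng = np.random.default_rng(0)
z_wt = rng.normal(size=64)
z_mut = rng.normal(size=(5, 64))
p_mut = rng.uniform(size=5)

delta_sem = np.abs(z_mut - z_wt).sum(axis=1)       # l1 semantic change
scores = cscs_scores(delta_sem, p_mut, beta=1.0)
print(scores.argsort()[::-1])                      # mutants ranked by score
```

The ℓ1 distance is used here only because the response cites its favorable behavior in high-dimensional spaces; as the authors note, other metrics could be substituted and evaluated empirically.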


Figure 1 Additional analysis

Neural Information Processing Systems

We answer each question below. Our strong empirical results back our design choice. As noted in the main paper (see Section 3.3, Framework Design), we learn target-to-source alignment; thus, the '+trad' entry should be replaced with a check mark. The clear improvement demonstrates its efficacy. The results are as follows: Ours 32.0 / ADVENT 29.1 / Adaptseg. R1, R4: end-to-end training. End-to-end training causes the model to diverge.



Author Response

Neural Information Processing Systems

We thank the reviewers for their valuable feedback. We address the comments and concerns as follows. MMAML does not use more data. MMAML does not have this assumption. We will clarify all these points in the revised paper.


Causality for Natural Language Processing

Jin, Zhijing

arXiv.org Artificial Intelligence

In the field of natural language processing (NLP), the capability to infer and reason about causality is increasingly recognized as a critical component of intelligent systems. Despite the recent advancement of large language models (LLMs) (Radford et al., 2019; Devlin et al., 2019; Brown et al., 2020; Zhang et al., 2022; OpenAI, 2023; Ignat et al., 2024, inter alia), a key question still remains: Can these models understand and reason about causality? This is a critical capability to establish before we can trust AI agents to be integrated into decision-making systems. Moreover, even if LLMs succeed at some degree of reasoning, they still lack transparency about how their decisions are made, creating a strong need for interpretability (Luo and Specia, 2024; Räuker et al., 2023; Zou et al., 2023). To bridge this gap, this thesis explores various facets of causal reasoning in LLMs. We present a series of studies that collectively advance the knowledge of how well these models perform causal reasoning (Part I), how their decisions are made (Part II), how causality among learning variables influences NLP tasks (Part III), and how causality and NLP can together analyze social problems (Part IV). Below we give an overview of the four parts and their corresponding chapters.