 Large Language Model




B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory

Neural Information Processing Systems

We leverage ideas from Stochastic Realization Theory to develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within an elementary composable module. The overall architecture can be used to implement models that can access short-term eidetic memory "in-context," permanent structural memory "in-weights,"


Probing the Decision Boundaries of In-context Learning in Large Language Models

Neural Information Processing Systems

Recent language models, such as GPT-3+ [Brown et al., 2020, Achiam et al., 2023], have demonstrated … Recent attempts to understand in-context learning have focused on various aspects. On the practical side, research has investigated the impact of different factors on in-context learning.




Online Adaptation of Language Models with a Memory of Amortized Contexts

Neural Information Processing Systems

However, given the ever-expanding corpus of unseen documents and the large parameter space of modern LLMs, efficient adaptation is essential. To address these challenges, we propose Memory of Amortized Contexts (MAC), an efficient and effective online adaptation framework for LLMs with strong knowledge retention.



Many-shot Jailbreaking

Neural Information Processing Systems

Longer contexts present a new attack surface for adversarial attacks. In search of a "fruit-fly" of long-context vulnerabilities, we study Many-shot Jailbreaking (MSJ; Figure 1), a simple yet effective and scalable jailbreak.


SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain

Neural Information Processing Systems

The integration of synthetically generated data in the second and third steps enhances the models' capabilities in interpreting and processing legal texts, effectively reaching state-of-the-art performance and outperforming