Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs

Neural Information Processing Systems

The interaction is usually modeled as a Markov Decision Process (MDP). Research on MDPs can be broadly divided into two lines based on the reward generation mechanism. The first line of work [Jaksch et al., 2010, Azar et al., 2013, 2017, He et al., 2021] considers the



WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models

Neural Information Processing Systems

Despite growing interest, much of the existing research has focused on varied unlearning method designs to boost effectiveness and efficiency. However, the inherent relationship between model weights and LLM unlearning has not been extensively examined.