Evolving Standardization for Continual Domain Generalization over Temporal Drift

Neural Information Processing Systems

The capability of generalizing to out-of-distribution data is crucial for the deployment of machine learning models in the real world.




Augmented Memory Replay-based Continual Learning Approaches for Network Intrusion Detection

Neural Information Processing Systems

Appendix contents: Network intrusion detection system; Continual learning with shallow methods; Detailed illustration of configuration changes; Dataset details; Data preprocessing and feature selection; Task formulation; Task similarity via optimal transport dataset distance; Training time comparison of the proposed ECBRS with the baselines; Additional experiments with anomaly detection datasets; Ablation studies; Implementation, hardware details, and hyperparameter selection; Occurrence of task dissimilarity between two different tasks is rare; Limitations and broader impact.

A.1 Network intrusion detection system. NID comprises two parts: the training module and the anomaly detection engine. The training can be periodic or triggered by an event such as a decay in intrusion detection accuracy. These features are fed to the anomaly detection engine to identify anomaly pattern(s).

In our work, shallow methods are the non-neural-network-based approaches.

Backward transfer (BWT) is the influence that learning a task t has on the performance of previous tasks. Negative BWT occurs when learning a task diminishes proficiency in prior tasks.
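As a rough illustration of the BWT description above, the sketch below computes backward transfer from an accuracy matrix. It assumes the commonly used definition from the continual learning literature (the average, over all earlier tasks, of final accuracy minus accuracy right after that task was learned). The excerpt does not state the paper's exact formula, and the name `backward_transfer` and the matrix layout are illustrative assumptions.

```python
def backward_transfer(acc):
    """Average backward transfer over all tasks except the last.

    Assumed layout (not taken from the source): acc[i][j] is the accuracy
    on task j measured after training has finished on task i.
    """
    n = len(acc)
    if n < 2:
        raise ValueError("backward transfer needs at least two tasks")
    last = acc[n - 1]
    # For each earlier task j: final accuracy minus accuracy just after learning j.
    diffs = [last[j] - acc[j][j] for j in range(n - 1)]
    return sum(diffs) / (n - 1)


# Hypothetical accuracies for three tasks, for illustration only.
example = [
    [0.90, 0.00, 0.00],
    [0.80, 0.85, 0.00],
    [0.70, 0.80, 0.90],
]
bwt = backward_transfer(example)  # negative value: earlier tasks were partly forgotten
```

Under this definition, a negative result means that later training reduced accuracy on earlier tasks, which matches the forgetting behaviour described in the text.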





Coherent Soft Imitation Learning

Joe Watson, Sandy H. Huang, Nicolas Heess

Neural Information Processing Systems

Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) for the policy or inverse reinforcement learning (IRL) for the reward. Such methods enable agents to learn complex tasks from humans that are difficult to capture with hand-designed reward functions.



Vul-R2: A Reasoning LLM for Automated Vulnerability Repair

arXiv.org Artificial Intelligence

Abstract: The exponential increase in software vulnerabilities has created an urgent need for automatic vulnerability repair (AVR) solutions. Recent research has formulated AVR as a sequence generation problem and has leveraged large language models (LLMs) to address it. Typically, these approaches prompt or fine-tune LLMs to generate repairs for vulnerabilities directly. Although these methods show state-of-the-art performance, they face the following challenges: (1) Lack of high-quality, vulnerability-related reasoning data. Current approaches primarily rely on foundation models that mainly encode general programming knowledge. Without vulnerability-related reasoning data, they tend to fail to capture the diverse vulnerability repair patterns. (2) Lack of intermediate, verifiable feedback. Existing reinforcement learning methods often leverage intermediate execution feedback from the environment (e.g., sandbox-based execution results) to guide training. In contrast, the vulnerability repair process generally lacks such feedback, which poses additional challenges for model training. To address these challenges, we propose to model the vulnerability repair task from a reasoning perspective and train a reasoning LLM termed Vulnerability Reasoner and Repair (Vul-R2), which consists of two key modules: (1) a domain-aware reasoning learning module, which comprises a reasoning answer construction component, a reasoning data filtering process, and a supervised fine-tuning process for learning vulnerability-related reasoning knowledge; and (2) a curriculum-based verifiable rewarded training module, which dynamically applies reinforcement learning with verifiable rewards, based on multiple-choice question answering in an easy stage and character-level matching in a hard stage. We evaluate Vul-R2 on the real-world C/C++ dataset PrimeVul to demonstrate its effectiveness in vulnerability repair.
Specifically, Vul-R2 outperforms the best baseline by 11.27% in exact match (EM) and successfully repairs 49 additional vulnerabilities. Furthermore, we demonstrate the effectiveness of the proposed paradigm: fine-tuning Vul-R2 on PrimeVul improves EM by 8.78% on the human-curated dataset SVEN, even without additional training.
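To make the two-stage verifiable reward idea more concrete, here is a minimal sketch, not the authors' implementation. It assumes the easy stage rewards a correctly chosen option letter and the hard stage scores character-level similarity between the generated and reference patch. The abstract does not specify the matching function, so `difflib.SequenceMatcher` is a stand-in, and all function names and stage labels are hypothetical.

```python
from difflib import SequenceMatcher


def multiple_choice_reward(predicted_option: str, correct_option: str) -> float:
    """Easy stage (assumed form): 1.0 if the chosen option letter is correct."""
    same = predicted_option.strip().upper() == correct_option.strip().upper()
    return 1.0 if same else 0.0


def char_match_reward(generated_patch: str, reference_patch: str) -> float:
    """Hard stage (assumed form): character-level similarity in [0, 1]."""
    matcher = SequenceMatcher(None, generated_patch, reference_patch, autojunk=False)
    return matcher.ratio()


def curriculum_reward(stage: str, prediction: str, reference: str) -> float:
    """Dispatch to the reward for the current curriculum stage."""
    if stage == "easy":
        return multiple_choice_reward(prediction, reference)
    if stage == "hard":
        return char_match_reward(prediction, reference)
    raise ValueError(f"unknown stage: {stage!r}")


# Hypothetical usage: option letters in the easy stage, patch text in the hard stage.
easy_score = curriculum_reward("easy", " b ", "B")
hard_score = curriculum_reward("hard", "if (len < n) return;", "if (len <= n) return;")
```

Both rewards can be computed from the model output and a stored reference alone, without executing code in a sandbox, which is the property the abstract highlights for settings that lack intermediate execution feedback.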