new skill
An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem
Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time, training data, or model size increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute. We compare our detailed calculations to direct simulations of a two-layer neural network trained on multitask sparse parity, where the tasks in the dataset are distributed according to a power-law. Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size or model size increases in the neural network.
Adversarial Déjà Vu: Jailbreak Dictionary Learning for Stronger Generalization to Unseen Attacks
Dabas, Mahavir, Huynh, Tran, Billa, Nikhil Reddy, Wang, Jiachen T., Gao, Peng, Peris, Charith, Ma, Yao, Gupta, Rahul, Jin, Ming, Mittal, Prateek, Jia, Ruoxi
Large language models remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Defending against novel jailbreaks represents a critical challenge in AI safety. Adversarial training -- designed to make models robust against worst-case perturbations -- has been the dominant paradigm for adversarial robustness. However, due to optimization challenges and difficulties in defining realistic threat models, adversarial training methods often fail on newly developed jailbreaks in practice. This paper proposes a new paradigm for improving robustness against unseen jailbreaks, centered on the Adversarial Déjà Vu hypothesis: novel jailbreaks are not fundamentally new, but largely recombinations of adversarial skills from previous attacks. We study this hypothesis through a large-scale analysis of 32 attack papers published over two years. Using an automated pipeline, we extract and compress adversarial skills into a sparse dictionary of primitives, with LLMs generating human-readable descriptions. Our analysis reveals that unseen attacks can be effectively explained as sparse compositions of earlier skills, with explanatory power increasing monotonically as skill coverage grows. Guided by this insight, we introduce Adversarial Skill Compositional Training (ASCoT), which trains on diverse compositions of skill primitives rather than isolated attack instances. ASCoT substantially improves robustness to unseen attacks, including multi-turn jailbreaks, while maintaining low over-refusal rates. We also demonstrate that expanding adversarial skill coverage, not just data scale, is key to defending against novel attacks. \textcolor{red}{\textbf{Warning: This paper contains content that may be harmful or offensive in nature.
- North America > United States > Virginia (0.04)
- Africa > Kenya (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Military (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
T2S: Tokenized Skill Scaling for Lifelong Imitation Learning
Zhang, Hongquan, Gong, Jingyu, Zhang, Zhizhong, Tan, Xin, Qu, Yanyun, Xie, Yuan
The main challenge in lifelong imitation learning lies in the balance between mitigating catastrophic forgetting of previous skills while maintaining sufficient capacity for acquiring new ones. However, current approaches typically address these aspects in isolation, overlooking their internal correlation in lifelong skill acquisition. We address this limitation with a unified framework named Tokenized Skill Scaling (T2S). Specifically, by tokenizing the model parameters, the linear parameter mapping of the traditional transformer is transformed into cross-attention between input and learnable tokens, thereby enhancing model scalability through the easy extension of new tokens. Additionally, we introduce language-guided skill scaling to transfer knowledge across tasks efficiently and avoid linearly growing parameters. Extensive experiments across diverse tasks demonstrate that T2S: 1) effectively prevents catastrophic forgetting (achieving an average NBT of 1.0% across the three LIBERO task suites), 2) excels in new skill scaling with minimal increases in trainable parameters (needing only 8.0% trainable tokens in an average of lifelong tasks), and 3) enables efficient knowledge transfer between tasks (achieving an average FWT of 77.7% across the three LIBERO task suites), offering a promising solution for lifelong imitation learning.
- North America > United States (0.14)
- Asia > China > Shanghai > Shanghai (0.05)
- Asia > China > Fujian Province > Xiamen (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Preparing for kick-off at RoboCup2025: an interview with General Chair Marco Simões
The Salvador Convention Center, where RoboCup 2025 will take place. RoboCup is an international scientific initiative with the goal of advancing the state of the art of intelligent robots, AI and automation. The annual RoboCup event, where teams gather from across the globe to take part in competitions across a number of leagues, will this year take place in Brazil, from 15-21 July. We spoke to Marco Simões, one of the General Chairs of RoboCup 2025 and President of RoboCup Brazil, to find out what plans they have for the event, some new initiatives, and how RoboCup has grown in Brazil over the past ten years. RoboCup will be held in Salvador, Brazil.
- South America > Brazil > Bahia > Salvador (0.25)
- Europe > Netherlands > North Brabant > Eindhoven (0.05)
- South America > Brazil > Paraíba > João Pessoa (0.05)
An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem
Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time, training data, or model size increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute. We compare our detailed calculations to direct simulations of a two-layer neural network trained on multitask sparse parity, where the tasks in the dataset are distributed according to a power-law. Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size or model size increases in the neural network.
SkillFlow: Efficient Skill and Code Transfer Through Communication in Adapting AI Agents
Tagkopoulos, Pagkratios, Li, Fangzhou, Tagkopoulos, Ilias
AI agents are autonomous systems that can execute specific tasks based on predefined programming. Here, we present SkillFlow, a modular, technology-agnostic framework that allows agents to expand their functionality in an ad-hoc fashion by acquiring new skills from their environment or other agents. We present a theoretical model that examines under which conditions this framework would be beneficial, and we then explore SkillFlow's ability to accelerate task completion and lead to lower cumulative costs in a real-world application, namely scheduling agents for calendar events. We demonstrate that within a few iterations, SkillFlow leads to considerable (24.8%, p-value = $6.4\times10^{-3}$) gains in time and cost, especially when the communication cost is high. Finally, we draw analogies from well-studied biological systems and compare this framework to that of lateral gene transfer, a significant process of adaptation and evolution in novel environments.
- Asia > South Korea (0.04)
- North America > United States > California > Yolo County > Davis (0.04)
- Europe > United Kingdom > England > West Sussex (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation
Yao, Yuanqi, Liu, Siao, Song, Haoming, Qu, Delin, Chen, Qizhi, Ding, Yan, Zhao, Bin, Wang, Zhigang, Li, Xuelong, Wang, Dong
Building a lifelong robot that can effectively leverage prior knowledge for continuous skill acquisition remains significantly challenging. Despite the success of experience replay and parameter-efficient methods in alleviating catastrophic forgetting problem, naively applying these methods causes a failure to leverage the shared primitives between skills. T o tackle these issues, we propose Primitive Prompt Learning (PPL), to achieve lifelong robot manipulation via reusable and extensible primitives. Within our two stage learning scheme, we first learn a set of primitive prompts to represent shared primitives through multi-skills pre-training stage, where motion-aware prompts are learned to capture semantic and motion shared primitives across different skills. Secondly, when acquiring new skills in lifelong span, lifelong prompts are appended and optimized with frozen pretrained prompts, boosting the learning via knowledge transfer from old skills to new ones. F or evaluation, we construct a large-scale skill dataset and conduct extensive experiments in both simulation and real-world tasks, demonstrating PPL's superior performance over state-of-the-art methods.
INPROVF: Leveraging Large Language Models to Repair High-level Robot Controllers from Assumption Violations
Meng, Qian, Zhou, Jin Peng, Weinberger, Kilian Q., Kress-Gazit, Hadas
This paper presents INPROVF, an automatic framework that combines large language models (LLMs) and formal methods to speed up the repair process of high-level robot controllers. Previous approaches based solely on formal methods are computationally expensive and cannot scale to large state spaces. In contrast, INPROVF uses LLMs to generate repair candidates, and formal methods to verify their correctness. To improve the quality of these candidates, our framework first translates the symbolic representations of the environment and controllers into natural language descriptions. If a candidate fails the verification, INPROVF provides feedback on potential unsafe behaviors or unsatisfied tasks, and iteratively prompts LLMs to generate improved solutions. We demonstrate the effectiveness of INPROVF through 12 violations with various workspaces, tasks, and state space sizes.
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > Canada > Ontario > Toronto (0.04)
iManip: Skill-Incremental Learning for Robotic Manipulation
Zheng, Zexin, Cai, Jia-Feng, Wu, Xiao-Ming, Wei, Yi-Lin, Tang, Yu-Ming, Zheng, Wei-Shi
The development of a generalist agent with adaptive multiple manipulation skills has been a long-standing goal in the robotics community. In this paper, we explore a crucial task, skill-incremental learning, in robotic manipulation, which is to endow the robots with the ability to learn new manipulation skills based on the previous learned knowledge without re-training. First, we build a skill-incremental environment based on the RLBench benchmark, and explore how traditional incremental methods perform in this setting. We find that they suffer from severe catastrophic forgetting due to the previous methods on classification overlooking the characteristics of temporality and action complexity in robotic manipulation tasks. Towards this end, we propose an incremental Manip}ulation framework, termed iManip, to mitigate the above issues. We firstly design a temporal replay strategy to maintain the integrity of old skills when learning new skill. Moreover, we propose the extendable PerceiverIO, consisting of an action prompt with extendable weight to adapt to new action primitives in new skill. Extensive experiments show that our framework performs well in Skill-Incremental Learning. Codes of the skill-incremental environment with our framework will be open-source.
Skill Expansion and Composition in Parameter Space
Liu, Tenglong, Li, Jianxiong, Zheng, Yinan, Niu, Haoyi, Lan, Yixing, Xu, Xin, Zhan, Xianyuan
Humans excel at reusing prior knowledge to address new challenges and developing skills while solving problems. This paradigm becomes increasingly popular in the development of autonomous agents, as it develops systems that can self-evolve in response to new challenges like human beings. However, previous methods suffer from limited training efficiency when expanding new skills and fail to fully leverage prior knowledge to facilitate new task learning. In this paper, we propose Parametric Skill Expansion and Composition (PSEC), a new framework designed to iteratively evolve the agents' capabilities and efficiently address new challenges by maintaining a manageable skill library. This library can progressively integrate skill primitives as plug-and-play Low-Rank Adaptation (LoRA) modules in parameter-efficient finetuning, facilitating efficient and flexible skill expansion. This structure also enables the direct skill compositions in parameter space by merging LoRA modules that encode different skills, leveraging shared information across skills to effectively program new skills. Based on this, we propose a context-aware module to dynamically activate different skills to collaboratively handle new tasks. Empowering diverse applications including multi-objective composition, dynamics shift, and continual policy shift, the results on D4RL, DSRL benchmarks, and the DeepMind Control Suite show that PSEC exhibits superior capacity to leverage prior knowledge to efficiently tackle new challenges, as well as expand its skill libraries to evolve the capabilities. Project website: https://ltlhuuu.github.io/PSEC/.
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Research Report (1.00)
- Instructional Material (0.92)
- Education (1.00)
- Leisure & Entertainment > Games (0.67)