 few-shot adaptation


Optimization Inspired Few-Shot Adaptation for Large Language Models

Neural Information Processing Systems

Large Language Models (LLMs) have demonstrated remarkable performance in real-world applications. However, adapting LLMs to novel tasks via fine-tuning often requires substantial training data and computational resources that are impractical in few-shot scenarios. Existing approaches, such as in-context learning and Parameter-Efficient Fine-Tuning (PEFT), face key limitations: in-context learning introduces additional computational overhead at inference with limited performance gains, while PEFT models are prone to overfitting on the few demonstration examples.





Overleaf Example

Neural Information Processing Systems

Large transformer-based foundation models have been commonly used as pre-trained models that can be adapted to different challenging datasets and settings with state-of-the-art generalization performance.



Appendix

Neural Information Processing Systems

For both the RBF and the Matérn-3/2 kernels, we consider three possible ranges of lengthscales: [0.07, 0.13], [0.17, 0.23], and [0.27, 0.33]. For both training and the fast adaptation during testing, we apply 5-shot adaptation (i.e., 5 black-box functions are used for adaptation) and set the number of few-shot gradient updates to be 5. Specifically, the amount of translation added to each dimension of x is selected uniformly from the range [-0.1 x_lim, 0.1 x_lim]. To address the continuous input domains and achieve a fair comparison between FSAF and MetaBO, we leverage the hierarchical gridding method similar to that in [27] for the maximization procedure of the AFs. The validation set is used both for few-shot adaptation of FSAF and for finding the lengthscale parameter of the GP surrogate model for posterior inference.
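As a rough illustration of the two kernel families and the lengthscale ranges mentioned above, the following is a minimal sketch using the standard textbook forms of the RBF and Matérn-3/2 kernels. The function names are hypothetical, and drawing a lengthscale uniformly within a range is an assumption; the excerpt does not state how a value is picked inside each range, and this is not the authors' code.

```python
import numpy as np

# Lengthscale ranges quoted in the text above.
LENGTHSCALE_RANGES = [(0.07, 0.13), (0.17, 0.23), (0.27, 0.33)]


def pairwise_dist(X1, X2):
    """Euclidean distances between rows of X1 (n, d) and X2 (m, d)."""
    diff = X1[:, None, :] - X2[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


def rbf_kernel(X1, X2, lengthscale):
    """RBF kernel: exp(-r^2 / (2 * lengthscale^2))."""
    r = pairwise_dist(X1, X2)
    return np.exp(-0.5 * (r / lengthscale) ** 2)


def matern32_kernel(X1, X2, lengthscale):
    """Matern-3/2 kernel: (1 + sqrt(3) r / l) * exp(-sqrt(3) r / l)."""
    s = np.sqrt(3.0) * pairwise_dist(X1, X2) / lengthscale
    return (1.0 + s) * np.exp(-s)


def sample_lengthscale(rng, range_index):
    """Assumption: draw a lengthscale uniformly from the chosen range."""
    low, high = LENGTHSCALE_RANGES[range_index]
    return rng.uniform(low, high)
```

A covariance matrix for a set of inputs `X` could then be built with, for example, `matern32_kernel(X, X, sample_lengthscale(np.random.default_rng(0), 1))`.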


Multi-Head Adapter Routing for Cross-Task Generalization

Neural Information Processing Systems

Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists of pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] ($\texttt{Poly}$) jointly learns an inventory of adapters and a *routing* function that selects a (variable-size) subset of adapters for each task during both pre-training and few-shot adaptation. In this paper, we investigate the role that adapter routing plays in its success and design new variants based on our findings. First, we build on the intuition that finer-grained routing provides more expressivity. Hence, we propose $\texttt{MHR}$ (Multi-Head Routing), which combines *subsets* of adapter parameters and outperforms $\texttt{Poly}$ under a comparable parameter budget; by only fine-tuning the routing function and not the adapters ($\texttt{MHR}$-$z$), we achieve competitive performance with extreme parameter efficiency. Second, we find that $\texttt{Poly}$/$\texttt{MHR}$ performance is a result of better multi-task optimization, rather than modular inductive biases that facilitate adapter recombination and local adaptation, as previously hypothesized.
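The difference between routing over whole adapters and routing over subsets of adapter parameters can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's implementation: each adapter is flattened to a single parameter vector, the routing weights come from a softmax over learned logits, and each head owns one contiguous block of the vector. All function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def single_head_route(adapters, logits):
    """Mix whole adapters with one weight per adapter.

    adapters: (n_adapters, dim) flattened adapter parameters.
    logits:   (n_adapters,) routing logits for one task.
    Returns the task-specific adapter of shape (dim,).
    """
    return softmax(logits) @ adapters


def multi_head_route(adapters, logits):
    """Mix adapters block by block, with separate weights per head.

    adapters: (n_adapters, dim) flattened adapter parameters.
    logits:   (n_heads, n_adapters) routing logits for one task.
    Assumption: head h controls the h-th contiguous block of size dim / n_heads.
    """
    n_heads = logits.shape[0]
    n_adapters, dim = adapters.shape
    if dim % n_heads != 0:
        raise ValueError("dim must be divisible by n_heads")
    blocks = adapters.reshape(n_adapters, n_heads, dim // n_heads)
    weights = softmax(logits, axis=-1)  # (n_heads, n_adapters)
    # For each head, take a weighted sum of that head's block across adapters.
    mixed = np.einsum("hk,khd->hd", weights, blocks)
    return mixed.reshape(dim)
```

In this sketch, fine-tuning only `logits` while keeping `adapters` frozen corresponds loosely to the routing-only setting described above, since the number of trainable values is `n_heads * n_adapters` rather than `n_adapters * dim`.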


Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation

Neural Information Processing Systems

Learning new task-specific skills from a few trials is a fundamental challenge for artificial intelligence. Meta reinforcement learning (meta-RL) tackles this problem by learning transferable policies that support few-shot adaptation to unseen tasks. Despite recent advances in meta-RL, most existing methods require access to the environmental reward function of new tasks to infer the task objective, which is not realistic in many practical applications. To bridge this gap, we study the problem of few-shot adaptation in the context of human-in-the-loop reinforcement learning. We develop a meta-RL algorithm that enables fast policy adaptation with preference-based feedback. The agent can adapt to new tasks by querying a human's preference between behavior trajectories instead of using per-step numeric rewards. By extending techniques from information theory, our approach can design query sequences to maximize the information gain from human interactions while tolerating the inherent error of a non-expert human oracle. In experiments, we extensively evaluate our method, Adaptation with Noisy OracLE (ANOLE), on a variety of meta-RL benchmark tasks and demonstrate substantial improvement over baseline algorithms in terms of both feedback efficiency and error tolerance.


H-Zero: Cross-Humanoid Locomotion Pretraining Enables Few-shot Novel Embodiment Transfer

arXiv.org Artificial Intelligence

The rapid advancement of humanoid robotics has intensified the need for robust and adaptable controllers to enable stable and efficient locomotion across diverse platforms. However, developing such controllers remains a significant challenge because existing solutions are tailored to specific robot designs, requiring extensive tuning of reward functions, physical parameters, and training hyperparameters for each embodiment. To address this challenge, we introduce H-Zero, a cross-humanoid locomotion pretraining pipeline that learns a generalizable humanoid base policy. We show that pretraining on a limited set of embodiments enables zero-shot and few-shot transfer to novel humanoid robots with minimal fine-tuning. Evaluations show that the pretrained policy maintains up to 81% of the full episode duration on unseen robots in simulation while enabling few-shot transfer to unseen humanoids and upright quadrupeds within 30 minutes of fine-tuning.

