out-of-domain generalization
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Li, Hongkang, Zhang, Yihua, Zhang, Shuai, Wang, Meng, Liu, Sijia, Chen, Pin-Yu
Task arithmetic refers to editing the pre-trained model by adding a weighted sum of task vectors, each of which is the weight update from the pre-trained model to a model fine-tuned for a certain task. This approach recently gained attention as a computationally efficient inference method for model editing, e.g., multi-task learning, forgetting, and out-of-domain generalization capabilities. However, the theoretical understanding of why task vectors can execute various conceptual operations remains limited, due to the high non-convexity of training Transformer-based models. To the best of our knowledge, this paper provides the first theoretical characterization of the generalization guarantees of task vector methods on nonlinear Transformers. We consider a conceptual learning setting, where each task is a binary classification problem based on a discriminative pattern. We theoretically prove the effectiveness of task addition in simultaneously learning a set of irrelevant or aligned tasks, as well as the success of task negation in unlearning one task from irrelevant or contradictory tasks. Moreover, we prove that a proper selection of linear coefficients for task arithmetic achieves guaranteed generalization to out-of-domain tasks. All of our theoretical results hold for both dense-weight parameters and their low-rank approximations. Although established in a conceptual setting, our theoretical findings were validated on a practical machine unlearning task using the large language model Phi-1.5 (1.3B).
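The task-vector definition in the abstract above can be illustrated with a minimal sketch. It assumes model weights are stored as a plain dictionary of float lists. The function names `task_vector` and `apply_task_arithmetic` and the toy weights are hypothetical and are not taken from the paper.

```python
# Minimal sketch of task arithmetic on toy weights (illustrative only).
# Assumption: a "model" is a dict mapping parameter names to lists of floats.

def task_vector(pretrained, finetuned):
    """Task vector = fine-tuned weights minus pre-trained weights."""
    return {
        name: [f - p for p, f in zip(pretrained[name], finetuned[name])]
        for name in pretrained
    }

def apply_task_arithmetic(pretrained, task_vectors, coefficients):
    """Return pre-trained weights plus a weighted sum of task vectors."""
    edited = {name: list(values) for name, values in pretrained.items()}
    for tau, lam in zip(task_vectors, coefficients):
        for name in edited:
            edited[name] = [w + lam * d for w, d in zip(edited[name], tau[name])]
    return edited

# Hypothetical toy weights for a pre-trained model and two fine-tuned models.
theta_0 = {"w": [1.0, 2.0]}
theta_a = {"w": [1.5, 2.0]}
theta_b = {"w": [1.0, 3.0]}

tau_a = task_vector(theta_0, theta_a)
tau_b = task_vector(theta_0, theta_b)

# Task addition: positive coefficients combine both tasks.
added = apply_task_arithmetic(theta_0, [tau_a, tau_b], [1.0, 1.0])
# Task negation: a negative coefficient moves away from task A.
negated = apply_task_arithmetic(theta_0, [tau_a], [-1.0])
```

The coefficients here are fixed at 1.0 and -1.0 purely for illustration; how to choose them for guaranteed generalization is the subject of the paper's analysis.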
Evaluating and Enhancing Out-of-Domain Generalization of Task-Oriented Dialog Systems for Task Completion without Turn-level Dialog Annotations
Mosharrof, Adib, Fereidouni, Moghis, Siddique, A. B.
Traditional task-oriented dialog (ToD) systems rely heavily on labor-intensive turn-level annotations, such as dialogue states and policy labels, for training. This work explores whether large language models (LLMs) can be fine-tuned solely on natural language dialogs to perform ToD tasks, without requiring such annotations. We evaluate their ability to generalize to unseen domains and compare their performance with models trained on fully annotated data. Through extensive experiments with three open-source LLMs of varying sizes and two diverse ToD datasets, we find that models fine-tuned without turn-level annotations generate coherent and contextually appropriate responses. However, their task completion performance - measured by accurate execution of API calls - remains suboptimal, with the best models achieving only around 53% success in unseen domains. To improve task completion, we propose ZeroToD, a framework that incorporates a schema augmentation mechanism to enhance API call accuracy and overall task completion rates, particularly in out-of-domain settings. We also compare ZeroToD with fine-tuning-free alternatives, such as prompting off-the-shelf LLMs, and find that our framework enables smaller, fine-tuned models that outperform large-scale proprietary LLMs in task completion. Additionally, a human study evaluating informativeness, fluency, and task completion confirms our empirical findings. These findings suggest the feasibility of developing cost-effective, scalable, and zero-shot generalizable ToD systems for real-world applications.
GeoMatch++: Morphology Conditioned Geometry Matching for Multi-Embodiment Grasping
Wei, Yunze, Attarian, Maria, Gilitschenski, Igor
As we aspire to solve more dexterous tasks in robotics, multi-finger grasping becomes increasingly important. However, the varying degrees of freedom (DoF) of end-effectors and the high multimodality of grasping modes, which depend on both end-effectors and objects, still pose open challenges. Previous works in grasping focus on parallel grippers [1, 2, 3], a single multi-finger gripper [4, 5, 6, 7], or a shared policy for multiple dexterous grippers [8, 9, 10, 11]. However, even methods that explore cross-embodiment mostly focus on generalization to unseen objects, and still show limited zero-shot generalization to unseen grippers. In this work, we propose GeoMatch++, a multi-embodiment grasping method which improves out-of-domain generalization on unseen grippers by leveraging robot morphology. Intuitively, robot morphology is essential to grasping: various end-effectors may have a different number of fingers, but fingertips and palm tend to be the most frequent contact regions. Thus, we hypothesize that learning good morphology embeddings can lead to a transferable grasping policy between different robots. Our main contribution is learning geometry correlation features between objects and end-effector morphology, which improve out-of-domain grasp success by 9.64% compared to previous methods, and our method showcases a minimal decrease in performance compared to in-domain evaluation.
Out-of-Domain Generalization in Dynamical Systems Reconstruction
Göring, Niclas, Hess, Florian, Brenner, Manuel, Monfared, Zahra, Durstewitz, Daniel
In science we are interested in finding the governing equations, the dynamical rules, underlying empirical phenomena. While traditionally scientific models are derived through cycles of human insight and experimentation, recently deep learning (DL) techniques have been advanced to reconstruct dynamical systems (DS) directly from time series data. State-of-the-art dynamical systems reconstruction (DSR) methods show promise in capturing invariant and long-term properties of observed DS, but their ability to generalize to unobserved domains remains an open challenge. Yet, this is a crucial property we would expect from any viable scientific theory. In this work, we provide a formal framework that addresses generalization in DSR. We explain why and how out-of-domain (OOD) generalization (OODG) in DSR profoundly differs from OODG considered elsewhere in machine learning. We introduce mathematical notions based on topological concepts and ergodic theory to formalize the idea of learnability of a DSR model. We formally prove that black-box DL techniques, without adequate structural priors, generally will not be able to learn a generalizing DSR model. We also show this empirically, considering major classes of DSR algorithms proposed so far, and illustrate where and why they fail to generalize across the whole phase space. Our study provides the first comprehensive mathematical treatment of OODG in DSR, and gives a deeper conceptual understanding of where the fundamental problems in OODG lie and how they could possibly be addressed in practice.
Augmentation-based Domain Generalization for Semantic Segmentation
Schwonberg, Manuel, Bouazati, Fadoua El, Schmidt, Nico M., Gottschalk, Hanno
Unsupervised Domain Adaptation (UDA) and domain generalization (DG) are two research areas that aim to tackle the lack of generalization of Deep Neural Networks (DNNs) towards unseen domains. While UDA methods have access to unlabeled target images, domain generalization does not involve any target data and only learns generalized features from a source domain. Image-style randomization or augmentation is a popular approach to improve network generalization without access to the target domain. Complex methods are often proposed that disregard the potential of simple image augmentations for out-of-domain generalization. For this reason, we systematically study the in- and out-of-domain generalization capabilities of simple, rule-based image augmentations like blur, noise, color jitter and many more. Based on a full factorial design of experiments, we provide a systematic statistical evaluation of augmentations and their interactions. Our analysis provides both expected and unexpected outcomes. Expected, because our experiments confirm the common scientific standard that a combination of multiple different augmentations outperforms single augmentations. Unexpected, because combined augmentations perform competitively with state-of-the-art domain generalization approaches, while being significantly simpler and without training overhead. On the challenging synthetic-to-real domain shift between Synthia and Cityscapes we reach 39.5% mIoU compared to 40.9% mIoU of the best previous work. When additionally employing the recent vision transformer architecture DAFormer we outperform these benchmarks with a performance of 44.2% mIoU.
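A full factorial design, as mentioned in the abstract above, runs every on/off combination of the factors under study. The sketch below enumerates such a design for three of the augmentations named in the abstract. The helper `full_factorial` and the choice of exactly three two-level factors are assumptions for illustration, not the authors' experimental setup.

```python
from itertools import product

# Three augmentations named in the abstract; the real study covers more.
AUGMENTATIONS = ["blur", "noise", "color_jitter"]

def full_factorial(factors):
    """Enumerate every on/off combination of the given two-level factors."""
    return [
        dict(zip(factors, levels))
        for levels in product([False, True], repeat=len(factors))
    ]

configs = full_factorial(AUGMENTATIONS)

# Each config is one experimental run, e.g. train with exactly these
# augmentations enabled and record in- and out-of-domain mIoU.
for config in configs:
    enabled = [name for name, on in config.items() if on]
    print(enabled if enabled else ["no augmentation"])
```

With k two-level factors the design has 2^k runs, which is what allows main effects and interactions between augmentations to be estimated from the same set of experiments.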
Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets
Yang, Yi, Zhang, Chen, Wang, Benyou, Song, Dawei
Over-parameterized models, typically pretrained language models (LMs), have shown an appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we for the first time posit that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). In order to intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive experiments are conducted on the Amazon, MNLI and OntoNotes datasets. The results show that the doge tickets obtain improved out-of-domain generalization in comparison with a range of competitive baselines. Analysis results further hint at the existence of domain-general parameters and the performance consistency of doge tickets.
OOD-Probe: A Neural Interpretation of Out-of-Domain Generalization
Zhu, Zining, Shahtalebi, Soroosh, Rudzicz, Frank
The ability to generalize out-of-domain (OOD) is an important goal for deep neural network development, and researchers have proposed many high-performing OOD generalization methods from various foundations. While many OOD algorithms perform well in various scenarios, these systems are evaluated as "black boxes". Instead, we propose a flexible framework that evaluates OOD systems with finer granularity using a probing module that predicts the originating domain from intermediate representations. We find that representations always encode some information about the domain. While the layerwise encoding patterns remain largely stable across different OOD algorithms, they vary across the datasets. For example, the information about rotation (on RotatedMNIST) is the most visible on the lower layers, while the information about style (on VLCS and PACS) is the most visible on the middle layers. In addition, the high probing results correlate with the domain generalization performances, leading to further directions in developing OOD generalization systems.
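The probing idea in the abstract above, predicting the originating domain from intermediate representations, can be sketched with a deliberately simple classifier. The abstract does not specify the probe architecture, so the nearest-centroid probe below is a swapped-in stand-in, and the feature vectors are made-up toy values rather than real layer activations.

```python
# Minimal sketch of a domain probe (illustrative only).
# Assumption: `features` are fixed-length activations from one layer,
# and `domains` are the domain labels the probe tries to recover.

def fit_centroids(features, domains):
    """Compute the mean feature vector for each domain."""
    sums, counts = {}, {}
    for x, d in zip(features, domains):
        if d not in sums:
            sums[d] = [0.0] * len(x)
            counts[d] = 0
        sums[d] = [s + v for s, v in zip(sums[d], x)]
        counts[d] += 1
    return {d: [s / counts[d] for s in sums[d]] for d in sums}

def predict_domain(centroids, x):
    """Assign x to the domain whose centroid is closest (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda d: sq_dist(centroids[d]))

def probe_accuracy(centroids, features, domains):
    """Fraction of held-out examples whose domain the probe recovers."""
    correct = sum(predict_domain(centroids, x) == d
                  for x, d in zip(features, domains))
    return correct / len(domains)

# Hypothetical layer activations from two domains, A and B.
train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
train_d = ["A", "A", "B", "B"]
test_x = [[0.2, 0.4], [5.1, 5.3]]
test_d = ["A", "B"]

centroids = fit_centroids(train_x, train_d)
accuracy = probe_accuracy(centroids, test_x, test_d)
```

Repeating this per layer and comparing accuracies would give a layerwise picture of how much domain information each representation encodes, which is the kind of comparison the abstract describes.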