 cityscape


Continual Optimization with Symmetry Teleportation for Multi-Task Learning

Neural Information Processing Systems

Multi-task learning (MTL) is a widely explored paradigm that enables the simultaneous learning of multiple tasks using a single model. Despite numerous solutions, the key issues of optimization conflict and task imbalance remain under-addressed, limiting performance. Unlike existing optimization-based approaches that typically reweight task losses or gradients to mitigate conflicts or promote progress, we propose a novel approach based on Continual Optimization with Symmetry Teleportation (COST). During MTL optimization, when an optimization conflict arises, we seek an alternative loss-equivalent point on the loss landscape to reduce conflict. Specifically, we utilize a low-rank adapter (LoRA) to facilitate this practical teleportation by designing convergent, loss-invariant objectives. Additionally, we introduce a historical trajectory reuse strategy to continually leverage the benefits of advanced optimizers. Extensive experiments on multiple mainstream datasets demonstrate the effectiveness of our approach. COST is a plug-and-play solution that enhances a wide range of existing MTL methods. When integrated with state-of-the-art methods, COST achieves superior performance.
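The abstract does not say how an optimization conflict is detected. A common convention in the MTL literature, assumed here and not taken from the paper, is to flag a conflict when two task gradients have negative cosine similarity. The minimal sketch below illustrates that check on plain Python lists; the function names and the `threshold` parameter are hypothetical.

```python
import math


def cosine_similarity(g1, g2):
    """Cosine similarity between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    if n1 == 0.0 or n2 == 0.0:
        # A zero gradient cannot conflict with anything.
        return 0.0
    return dot / (n1 * n2)


def has_conflict(task_grads, threshold=0.0):
    """Return True if any pair of task gradients points in opposing directions.

    Assumption: "conflict" means pairwise cosine similarity below `threshold`.
    """
    for i in range(len(task_grads)):
        for j in range(i + 1, len(task_grads)):
            if cosine_similarity(task_grads[i], task_grads[j]) < threshold:
                return True
    return False
```

Under this assumption, a training loop would call `has_conflict` on the per-task gradients of the shared parameters and trigger a teleportation step only when it returns `True`.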




Appendix Implementation Details

Neural Information Processing Systems

A.1 Network Architectures. We adopt Daformer [17] with a Swin-B or MiT-B5 backbone as the base semantic segmentation architecture. For the segmentation head, we use the same head as Daformer [17]. The stem module contains one fully convolutional layer with a 3 × 3 kernel and a stride of 2, two fully convolutional layers with 3 × 3 kernels and a stride of 1, two fully convolutional layers with 3 × 3 kernels and a stride of 2, and three further fully convolutional layers with 1 × 1 kernels and a stride of 1 to adjust the channels of the different feature maps. The level embedding module is defined as a matrix of shape 3 × dims. The prompt interactor module contains three fully convolutional layers with 3 × 3 kernels and a stride of 2 to adjust feature dimensions.
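To make the stem description concrete, the sketch below computes the spatial size of a feature map after the five 3 × 3 layers, using the standard convolution output-size formula. It assumes the layers are applied sequentially in the order listed and that each 3 × 3 layer uses a padding of 1; neither detail is stated in the text. The 1 × 1 stride-1 layers are omitted because they do not change the spatial size.

```python
def conv_out_size(size, kernel, stride, padding):
    """Standard output-size formula for a convolution without dilation."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride) pairs for the 3x3 stem layers, in the order listed in the text.
# Assumption: the layers run sequentially in this order.
STEM_LAYERS = [(3, 2)] + [(3, 1)] * 2 + [(3, 2)] * 2


def stem_out_size(size, padding=1):
    """Spatial size after the 3x3 stem layers (padding=1 is an assumption)."""
    for kernel, stride in STEM_LAYERS:
        size = conv_out_size(size, kernel, stride, padding)
    return size
```

Under these assumptions, the three stride-2 layers reduce a 512-pixel side to 64 pixels, an overall downsampling factor of 8.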



64f1f27bf1b4ec22924fd0acb550c235-Paper.pdf

Neural Information Processing Systems

The proposed MLP decoder aggregates information from different layers, thus combining both local attention and global attention to render powerful representations.



Supplementary Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments Thanh-Dat Truong

Neural Information Processing Systems

Contrastive Clustering loss and update the prototypical vectors. Algorithm 1: Prototypical Contrastive Clustering Loss. Compute Prototypical Contrastive Clustering Loss based on Eqn. Two segmentation network architectures have been used in our experiments, i.e., (1) DeepLab-V3 The learning rate is set individually for each step and dataset. Similarly, to illustrate the effectiveness and robustness of our method in the non-incremental setting. We also perform an additional ablation study on the ADE20K (100-50) benchmark to investigate the impact of the delta.