Goto

Collaborating Authors

 cola


COLA: Towards Efficient Multi-Objective Reinforcement Learning with Conflict Objective Regularization in Latent Space

Neural Information Processing Systems

Many real-world control problems require continual policy adjustments to balance multiple objectives, which requires the acquisition of high-quality policies to cover diverse preferences. Multi-Objective Reinforcement Learning (MORL) provides a general framework to solve such problems. However, current MORL methods suffer from high sample complexity, primarily due to the neglect of efficient knowledge sharing and conflicts in optimization with different preferences. To this end, this paper introduces a novel framework, Conflict Objective Regularization in Latent Space (COLA). To enable efficient knowledge sharing, COLA establishes a shared latent representation space for common knowledge, which can avoid redundant learning under different preferences. Besides, COLA introduces a regularization term for the value function to mitigate the negative effects of conflicting preferences on the value function approximation, thereby improving the accuracy of value estimation. The experimental results across various multi-objective continuous control tasks demonstrate the significant superiority of COLA over the state-of-the-art MORL baselines. Code is available at https://github.com/yeshenpy/COLA.


COLA: Towards Efficient Multi-Objective Reinforcement Learning with Conflict Objective Regularization in Latent Space

Neural Information Processing Systems

Many real-world control problems require continual policy adjustments to balance multiple objectives, which requires the acquisition of high-quality policies to cover diverse preferences. Multi-Objective Reinforcement Learning (MORL) provides a general framework to solve such problems. However, current MORL methods suffer from high sample complexity, primarily due to the neglect of efficient knowledge sharing and conflicts in optimization with different preferences.


Enhancing CLIP Robustness via Cross-Modality Alignment

Neural Information Processing Systems

Vision-language models (VLMs) such as CLIP demonstrate strong generalization in zero-shot classification but remain highly vulnerable to adversarial perturbations. Existing methods primarily focus on adversarial fine-tuning or prompt optimization, they often overlook the gaps in CLIP's encoded features, which is shown as the text and image features lie far apart from each other. This misalignment is significantly amplified under adversarial perturbations, leading to severe degradation in classification performance.


0346c148ba1c21c6b4780a961ea141dc-Supplemental-Conference.pdf

Neural Information Processing Systems

Table 7: Extensions of Table 1 with more details of prompts used to generate class-conditioned texts for different GLUE tasks. SST-2 and CoLA are single-sequence classification tasks and the rest are sequence-pair classification tasks. Generation for CoLA does not use prompts but by varying sampling temperatures. Text generation with CTRL [23] requires starting with control codes, and we use the ones that correspond to the pretraining corpus where the first sequence is sampled: For MNLI, RTE and MRPC, the first sequence is sampled from Wikipedia; for QNLI and QQP, the first sequence is sampled from OpenWebText [17]. The prompts used for SST-2 are part of the CTRL [23] codes. Furthermore, xg contradiction There is a rumor that xs.