Watermarks for Embeddings-as-a-Service Large Language Models
Large Language Models (LLMs) have demonstrated exceptional capabilities in natural language understanding and generation. Based on these LLMs, businesses have started to provide Embeddings-as-a-Service (EaaS), offering feature extraction capabilities (in the form of text embeddings) that benefit downstream natural language processing tasks. However, prior research has demonstrated that EaaS is vulnerable to imitation attacks, where an attacker clones the service's model in a black-box manner without access to the model's internal workings. In response, watermarks have been added to the text embeddings to protect the intellectual property of EaaS providers by allowing them to check for model ownership. This thesis focuses on defending against imitation attacks by investigating EaaS watermarks. To achieve this goal, we unveil novel attacks and propose and validate new watermarking techniques. Firstly, we show that existing EaaS watermarks can be removed through paraphrasing the input text when attackers clone the model during imitation attacks. Our study illustrates that paraphrasing can effectively bypass current state-of-the-art EaaS watermarks across various attack setups (including different paraphrasing techniques and models) and datasets in most instances. This demonstrates a new vulnerability in recent EaaS watermarking techniques. Subsequently, as a countermeasure, we propose a novel watermarking technique, WET (Watermarking EaaS with Linear Transformation), which employs linear transformation of the embeddings. Watermark verification is conducted by applying a reverse transformation and comparing the similarity between recovered and original embeddings. We demonstrate its robustness against paraphrasing attacks with near-perfect verifiability. We conduct detailed ablation studies to assess the significance of each component and hyperparameter in WET.
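The verification idea in the abstract above (transform the embedding linearly, then reverse the transformation and compare with the original) can be sketched as follows. This is a minimal illustration and not the WET construction itself: the secret matrix `SECRET`, the dimension, the example vectors, and the helpers `matvec`, `solve`, and `cosine` are all assumptions made for the example.

```python
import math

def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical secret matrix: diagonally dominant, hence invertible.
DIM = 4
SECRET = [[2.0 if i == j else 0.1 * ((i + 2 * j) % 3 - 1) for j in range(DIM)]
          for i in range(DIM)]

# Provider side: watermark the embedding before returning it.
original = [0.5, -0.2, 0.1, 0.7]
watermarked = matvec(SECRET, original)

# Verification: reverse the transformation and compare with the original.
sim_watermarked = cosine(solve(SECRET, watermarked), original)

# An embedding that never went through the transformation recovers poorly.
unrelated = [-0.3, 0.8, 0.4, -0.1]
sim_unrelated = cosine(solve(SECRET, unrelated), original)
```

In this toy setup `sim_watermarked` is close to 1 while `sim_unrelated` is far lower, which is the gap a verifier would threshold on.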
Multimodal Continual Learning with MLLMs from Multi-scenario Perspectives
Jiang, Kai, Huang, Siqi, Chen, Xiangyu, Shao, Jiawei, Zhang, Hongyuan, Li, Xuelong
Continual learning in visual understanding aims to deal with catastrophic forgetting in Multimodal Large Language Models (MLLMs). MLLMs deployed on devices have to continuously adapt to dynamic scenarios in downstream tasks, such as variations in background and perspective, to effectively perform complex visual tasks. To this end, we construct a multimodal visual understanding dataset (MSVQA) encompassing four different scenarios and perspectives including high altitude, underwater, low altitude and indoor, to investigate the catastrophic forgetting in MLLMs under the dynamics of scenario shifts in real-world data streams. Furthermore, we propose mUltimodal coNtInual learning with MLLMs From multi-scenarIo pERspectives (UNIFIER) to address visual discrepancies while learning different scenarios. Specifically, it decouples the visual information from different scenarios into distinct branches within each vision block and projects them into the same feature space. A consistency constraint is imposed on the features of each branch to maintain the stability of visual representations across scenarios. Extensive experiments on the MSVQA dataset demonstrate that UNIFIER effectively alleviates forgetting of cross-scenario tasks and achieves knowledge accumulation within the same scenario.
PG-Agent: An Agent Powered by Page Graph
Chen, Weizhi, Wang, Ziwei, Yang, Leyang, Zhou, Sheng, Tang, Xiaoxuan, Bu, Jiajun, Li, Yong, Jiang, Wei
Graphical User Interface (GUI) agents possess significant commercial and social value, and GUI agents powered by advanced multimodal large language models (MLLMs) have demonstrated remarkable potential. Currently, existing GUI agents usually utilize sequential episodes of multi-step operations across pages as the prior GUI knowledge, which fails to capture the complex transition relationship between pages, making it challenging for the agents to deeply perceive the GUI environment and generalize to new scenarios. Therefore, we design an automated pipeline to transform the sequential episodes into page graphs, which explicitly model the graph structure of the pages that are naturally connected by actions. To fully utilize the page graphs, we further introduce Retrieval-Augmented Generation (RAG) technology to effectively retrieve reliable perception guidelines of GUI from them, and a tailored multi-agent framework PG-Agent with task decomposition strategy is proposed to be injected with the guidelines so that it can generalize to unseen scenarios. Extensive experiments on various benchmarks demonstrate the effectiveness of PG-Agent, even with limited episodes for page graph construction.
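To make the episode-to-graph step concrete, the sketch below merges sequential episodes into one page graph whose edges are labeled with actions. It is only an illustration under assumptions: the episode format (a list of `(page, action)` steps), the page and action names, and the function `build_page_graph` are hypothetical, and the pipeline in the paper may identify pages and actions quite differently.

```python
from collections import defaultdict

def build_page_graph(episodes):
    """Merge sequential episodes into a graph: page -> sorted [(action, next_page)]."""
    graph = defaultdict(set)
    for episode in episodes:
        # Pair each step with its successor; the action taken on `page`
        # is assumed to lead to the page of the following step.
        for (page, action), (next_page, _) in zip(episode, episode[1:]):
            graph[page].add((action, next_page))
    return {page: sorted(edges) for page, edges in graph.items()}

# Two hypothetical episodes; the last step of each has no outgoing action.
episodes = [
    [("home", "tap_search"), ("search", "tap_result"), ("detail", None)],
    [("home", "tap_settings"), ("settings", "tap_back"),
     ("home", "tap_search"), ("search", None)],
]

page_graph = build_page_graph(episodes)
```

Here the repeated `home -> search` transition is stored once, and `home` gains two outgoing edges, which is the kind of cross-episode structure that a flat list of episodes does not expose.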
Study of Robust Features in Formulating Guidance for Heuristic Algorithms for Solving the Vehicle Routing Problem
Herdianto, Bachtiar, Billot, Romain, Lucas, Flavien, Sevaux, Marc
Combinatorial optimization problems, such as Vehicle Routing Problems (VRP), are important in real-world applications as they search for efficient solutions to minimize costs. Despite extensive research over decades, achieving optimal solutions remains a challenge (Laporte, 2009). Furthermore, the unique constraints of various problem variants demand specialized algorithms. The development of these algorithms is complex, making Machine Learning (ML) an attractive approach to improving the existing algorithms. Routing algorithms are typically divided into two categories: exact algorithms, which offer a global optimum but require substantial computational resources, and heuristic methods, which are used in practical, real-world applications and mostly find a near-optimal solution. While most heuristics rely on human-designed strategies (Lucas et al., 2020), ML offers a new approach to improving these algorithms. Moreover, the selection of features for these ML models plays a critical role in effectively enhancing heuristic performance (Arnold and Sörensen, 2019b; Arnold and Sörensen, 2019a; Lucas, Billot, and Sevaux, 2019). Understanding the predictions of an ML model can be as crucial as the accuracy of the prediction itself in many applications (Lundberg and Lee, 2017).
PERSCEN: Learning Personalized Interaction Pattern and Scenario Preference for Multi-Scenario Matching
Du, Haotong, Wang, Yaqing, Xiong, Fei, Shao, Lei, Liu, Ming, Gu, Hao, Yao, Quanming, Wang, Zhen
With the expansion of business scales and scopes on online platforms, multi-scenario matching has become a mainstream solution to reduce maintenance costs and alleviate data sparsity. The key to effective multi-scenario recommendation lies in capturing both user preferences shared across all scenarios and scenario-aware preferences specific to each scenario. However, existing methods often overlook user-specific modeling, limiting the generation of personalized user representations. To address this, we propose PERSCEN, an innovative approach that incorporates user-specific modeling into multi-scenario matching. PERSCEN constructs a user-specific feature graph based on user characteristics and employs a lightweight graph neural network to capture higher-order interaction patterns, enabling personalized extraction of preferences shared across scenarios. Additionally, we leverage vector quantization techniques to distil scenario-aware preferences from users' behavior sequence within individual scenarios, facilitating user-specific and scenario-aware preference modeling. To enhance efficient and flexible information transfer, we introduce a progressive scenario-aware gated linear unit that allows fine-grained, low-latency fusion. Extensive experiments demonstrate that PERSCEN outperforms existing methods. Further efficiency analysis confirms that PERSCEN effectively balances performance with computational cost, ensuring its practicality for real-world industrial systems.
Reinforcing User Interest Evolution in Multi-Scenario Learning for recommender systems
Feng, Zhijian, Zheng, Wenhao, Xiao, Xuanji
In real-world recommendation systems, users engage in a variety of scenarios, such as homepages, search pages, and related recommendation pages. Each of these scenarios reflects different aspects that users focus on. However, user interests may be inconsistent across scenarios, due to differences in decision-making processes and preference expression. This variability complicates unified modeling, making multi-scenario learning a significant challenge. To address this, we propose a novel reinforcement learning approach that models user preferences by capturing how user interest evolves across multiple scenarios. Our method employs Double Q-learning to enhance next-item prediction accuracy and uses Q-values to optimize the contrastive learning loss, further improving model performance. Experimental results demonstrate that our approach surpasses state-of-the-art methods in multi-scenario recommendation tasks. Our work offers a fresh perspective on multi-scenario modeling and highlights promising directions for future research.
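For readers unfamiliar with Double Q-learning, the sketch below shows the textbook tabular update, in which one table selects the best next action and the other table evaluates it. It is not the model from the paper: treating scenarios as states and items as actions, the table contents, and the function `double_q_update` are assumptions made purely for illustration.

```python
def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9):
    """Update table q_a in place and return the target that was used.

    q_a selects the greedy next action; q_b evaluates that action.
    Decoupling selection from evaluation is what reduces overestimation.
    """
    best_next = max(q_a[next_state], key=q_a[next_state].get)
    target = reward + gamma * q_b[next_state][best_next]
    q_a[state][action] += alpha * (target - q_a[state][action])
    return target

# Hypothetical tables: states are scenarios, actions are candidate items.
q_a = {"home": {"item1": 0.0, "item2": 0.0},
       "search": {"item1": 1.0, "item2": 0.5}}
q_b = {"home": {"item1": 0.0, "item2": 0.0},
       "search": {"item1": 0.2, "item2": 0.9}}

# The user clicks item1 on the homepage (reward 1.0) and moves to search.
target = double_q_update(q_a, q_b, "home", "item1", 1.0, "search")
```

In the full algorithm the roles of the two tables are swapped at random on each step, so that both are trained; the example performs a single update of `q_a` only.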
Adaptive Attention-Based Model for 5G Radio-based Outdoor Localization
Yaman, Ilayda, Tian, Guoda, Tufvesson, Fredrik, Edfors, Ove, Zhang, Zhengya, Liu, Liang
Radio-based localization in dynamic environments, such as urban and vehicular settings, requires systems that can efficiently adapt to varying signal conditions and environmental changes. Factors such as multipath interference and obstructions introduce different levels of complexity that affect the accuracy of the localization. Although generalized models offer broad applicability, they often struggle to capture the nuances of specific environments, leading to suboptimal performance in real-world deployments. In contrast, specialized models can be tailored to particular conditions, enabling more precise localization by effectively handling domain-specific variations and noise patterns. However, deploying multiple specialized models requires an efficient mechanism to select the most appropriate one for a given scenario. In this work, we develop an adaptive localization framework that combines shallow attention-based models with a router/switching mechanism based on a single-layer perceptron (SLP). This enables seamless transitions between specialized localization models optimized for different conditions, balancing accuracy, computational efficiency, and robustness to environmental variations. We design three low-complexity localization models tailored for distinct scenarios, optimized for reduced computational complexity, test time, and model size. The router dynamically selects the most suitable model based on real-time input characteristics. The proposed framework is validated using real-world vehicle localization data collected from a massive MIMO base station (BS), demonstrating its ability to seamlessly adapt to diverse deployment conditions while maintaining high localization accuracy.
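The routing idea described above can be sketched as a single linear layer followed by an argmax that picks one of three specialized models. Everything concrete here is an assumption for illustration: the two input features, the weights and biases, the scenario names, and the placeholder specialists stand in for the trained router and attention-based models of the paper.

```python
def slp_route(features, weights, biases):
    """Single-layer perceptron router: return the index of the highest-scoring model."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical router parameters: one weight row and one bias per specialist.
WEIGHTS = [[1.0, -1.0], [-1.0, 1.0], [0.2, 0.2]]
BIASES = [0.0, 0.0, 0.1]

# Placeholder specialists; a real system would call localization models here.
SPECIALISTS = [
    lambda features: "line_of_sight_model",
    lambda features: "dense_multipath_model",
    lambda features: "fallback_model",
]

def localize(features):
    """Route the input to one specialist and return its (placeholder) output."""
    return SPECIALISTS[slp_route(features, WEIGHTS, BIASES)](features)

# Hypothetical two-dimensional inputs (e.g. signal strength, delay spread).
choice_strong = localize([0.9, 0.1])
choice_spread = localize([0.1, 0.9])
choice_weak = localize([0.1, 0.1])
```

Because only the selected specialist runs, the per-sample cost is one small matrix-vector product plus one shallow model, which is the efficiency argument the abstract makes for routing over a single large model.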
Exploring the Panorama of Anxiety Levels: A Multi-Scenario Study Based on Human-Centric Anxiety Level Detection and Personalized Guidance
Faculty of Computer Science and Information Technology, University of Malaya, Malaysia
More and more people are under pressure from work, life and education. Under these pressures, people can develop an anxious state of mind, or even the initial symptoms of suicide. With the advancement of artificial intelligence technology, large language modeling is currently one of the hottest technologies. It is often used for detecting psychological disorders; however, current studies give only the categorization result and do not give an interpretable description of what led to it. Building on these immature studies, this study adopts a person-centered perspective and focuses on GPT-generated multi-scenario simulated conversations. These simulated conversations were selected as data samples for the study. Various transformer-based encoder models were utilized in order to integrate a classification model capable of identifying different anxiety levels. In addition, a knowledge base focusing on anxiety was constructed using Langchain and GPT4. When analyzing the classification results, this knowledge base was able to provide the explanations and reasons most relevant to the interlocutor's anxiety situation. The study shows that the developed model achieves more than 94% accuracy in categorical prediction and that the advice provided is highly personalized. Mental health is defined as a state of well-being on the mental, emotional, and social levels [8, 16, 34]. Abnormal anxiety is a very important factor that affects mental health [3, 19, 43].