Yin, Quanjun
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
Liu, Ting, Xu, Zunnan, Hu, Yue, Shi, Liangtao, Wang, Zhiqiang, Yin, Quanjun
Referring Expression Comprehension (REC), which aims to ground a local visual region via natural language, is a task that heavily relies on multimodal alignment. Most existing methods utilize powerful pre-trained models to transfer visual/linguistic knowledge by full fine-tuning. However, fully fine-tuning the entire backbone not only disrupts the rich prior knowledge embedded in the pre-training, but also incurs significant computational costs. Motivated by the recent emergence of Parameter-Efficient Transfer Learning (PETL) methods, we aim to solve the REC task in an effective and efficient manner. Directly applying these PETL methods to the REC task is inappropriate, as they lack the domain-specific abilities for precise local visual perception and visual-language alignment. Therefore, we propose a novel framework of Multimodal Prior-guided Parameter Efficient Tuning, namely MaPPER. Specifically, MaPPER comprises Dynamic Prior Adapters guided by an aligned prior, and Local Convolution Adapters to extract precise local semantics for better visual perception. Moreover, the Prior-Guided Text module is proposed to further utilize the prior for facilitating cross-modal alignment. Experimental results on three widely used benchmarks demonstrate that MaPPER achieves the best accuracy compared to full fine-tuning and other PETL methods, with only 1.41% tunable backbone parameters. Our code is available at https://github.com/liuting20/MaPPER.
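As a rough illustration of the adapter families the abstract names, the sketch below shows a standard bottleneck adapter and a depthwise-convolution adapter over the patch grid of a ViT-style backbone. The hidden size, bottleneck ratio, and residual placement are illustrative assumptions, not MaPPER's exact design.

```python
# Minimal sketch of two PETL adapter types, assuming a ViT backbone
# with hidden size d; only these modules would be trainable.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Standard PETL adapter: down-project, nonlinearity, up-project."""
    def __init__(self, d=768, r=64):
        super().__init__()
        self.down, self.up, self.act = nn.Linear(d, r), nn.Linear(r, d), nn.GELU()

    def forward(self, x):                    # x: (B, N, d) token sequence
        return x + self.up(self.act(self.down(x)))   # residual insertion

class LocalConvAdapter(nn.Module):
    """Depthwise conv over the patch grid to inject local visual cues."""
    def __init__(self, d=768, k=3):
        super().__init__()
        self.conv = nn.Conv2d(d, d, k, padding=k // 2, groups=d)

    def forward(self, x, h, w):              # x: (B, h*w, d) patch tokens
        y = x.transpose(1, 2).reshape(x.size(0), -1, h, w)
        y = self.conv(y).flatten(2).transpose(1, 2)
        return x + y                          # residual local refinement
```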
Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration
Li, Qinglun, Zhang, Miao, Liu, Yingqi, Yin, Quanjun, Shen, Li, Cao, Xiaochun
Decentralized Federated Learning has emerged as an alternative to centralized architectures due to its faster training, privacy preservation, and reduced communication overhead. In decentralized communication, the server aggregation phase of Centralized Federated Learning shifts to the client side, meaning that clients connect with each other in a peer-to-peer manner. However, compared to the centralized mode, data heterogeneity in Decentralized Federated Learning causes larger variances between aggregated models, which leads to slow convergence in training and poor generalization performance in testing. To address these issues, we introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata. It consists of two main components: the Moreau envelope function, which primarily addresses parameter inconsistencies among clients caused by data heterogeneity, and Nesterov's extrapolation step, which accelerates the aggregation phase. Theoretically, we prove the optimization error bound and generalization error bound of the algorithm, providing a further understanding of the nature of the algorithm and theoretical perspectives on the hyperparameter choice. Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions. Furthermore, we also experimentally verify the theoretical properties of DFedCata. Federated Learning (FL) is a new distributed machine learning paradigm that prioritizes privacy protection [1]-[3]. It enables multiple clients to collaborate on training models without sharing their raw data. Nowadays, much of the research [4]-[9] focuses on Centralized Federated Learning (CFL), but the central server in CFL brings various challenges regarding communication burden, single point of failure [10], privacy breaches [11], and so on. In contrast, Decentralized Federated Learning (DFL) performs both the local update and aggregation steps on the client, which offers enhanced privacy protection [12], faster model training [13], and robustness to slow client devices [14]. Therefore, DFL has become a popular alternative solution [10], [13].
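A minimal sketch of the two Catalyst-style ingredients the abstract names: a Moreau-envelope (proximal) regularizer added to the local objective, and a Nesterov-style extrapolation applied after aggregation. The step size, proximal weight, and extrapolation coefficient are illustrative assumptions, and the gossip/mixing step is omitted.

```python
# Sketch of DFedCata-style local proximal step and extrapolation,
# with lam, lr, and beta as illustrative hyperparameters.
import copy
import torch

def local_prox_step(model, loss_fn, data, anchor, lam=0.1, lr=0.01):
    """One SGD step on f_i(x) + (1/(2*lam)) * ||x - anchor||^2."""
    loss = loss_fn(model, data)
    for p, a in zip(model.parameters(), anchor.parameters()):
        loss = loss + (1.0 / (2 * lam)) * (p - a.detach()).pow(2).sum()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad
            p.grad = None

def nesterov_extrapolate(x_new, x_old, beta=0.9):
    """Parameter-wise x <- x_new + beta * (x_new - x_old)."""
    out = copy.deepcopy(x_new)
    with torch.no_grad():
        for o, n, p in zip(out.parameters(), x_new.parameters(),
                           x_old.parameters()):
            o.copy_(n + beta * (n - p))
    return out
```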
OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement
Li, Qinglun, Zhang, Miao, Wang, Mengzhu, Yin, Quanjun, Shen, Li
Decentralized Federated Learning (DFL) surpasses Centralized Federated Learning (CFL) in terms of faster training, privacy preservation, and light communication, making it a promising alternative in the field of federated learning. However, DFL still lags behind CFL in generalization ability, with limited theoretical understanding and degraded empirical performance due to severe inconsistency among clients. In this paper, we enhance the consistency of DFL by developing an opposite lookahead enhancement technique (Ole), yielding OledFL, which optimizes the initialization of each client in each communication round and thus significantly improves both generalization and convergence speed. Moreover, we rigorously establish its convergence rate in the non-convex setting and characterize its generalization bound through uniform stability, which provides concrete reasons why OledFL can achieve both fast convergence and high generalization ability. Extensive experiments conducted on the CIFAR10 and CIFAR100 datasets with Dirichlet and Pathological distributions illustrate that OledFL can achieve up to 5\% performance improvement and 8$\times$ speedup compared to the most popular DFedAvg optimizer in DFL.
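A minimal sketch of the Ole initialization idea, under one plausible reading of the abstract: before local training, each client extrapolates its starting point in the direction opposite to its previous local drift, pulling clients toward consistency. The coefficient alpha and the reference point are illustrative assumptions, not OledFL's verified update rule.

```python
# Hypothetical "opposite lookahead" round initialization:
# init = aggregated + alpha * (aggregated - prev_local).
import torch

@torch.no_grad()
def ole_init(aggregated, prev_local, alpha=0.5):
    """Mutates `aggregated` in place and returns it as the round's start."""
    for p_agg, p_loc in zip(aggregated.parameters(),
                            prev_local.parameters()):
        p_agg.add_(alpha * (p_agg - p_loc))
    return aggregated
```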
Conversational Crowdsensing: A Parallel Intelligence Powered Novel Sensing Approach
Zhu, Zhengqiu, Zhao, Yong, Chen, Bin, Qiu, Sihang, Xu, Kai, Yin, Quanjun, Huang, Jincai, Liu, Zhong, Wang, Fei-Yue
The transition from CPS-based Industry 4.0 to CPSS-based Industry 5.0 brings new requirements and opportunities to current sensing approaches, especially in light of recent progress in chatbots and Large Language Models (LLMs). Against this backdrop, parallel intelligence-powered Crowdsensing Intelligence (CSI) is advancing toward linguistic intelligence. In this paper, we propose a novel sensing paradigm, namely conversational crowdsensing, for Industry 5.0. It can alleviate the workload and professional requirements of individuals and promote the organization and operation of a diverse workforce, thereby facilitating faster response and wider popularization of crowdsensing systems. Specifically, we design the architecture of conversational crowdsensing to effectively organize three types of participants (biological, robotic, and digital) from diverse communities. Through three levels of effective conversation (i.e., inter-human, human-AI, and inter-AI), complex interactions and service functionalities of different workers can be achieved to accomplish various tasks across three sensing phases (i.e., requesting, scheduling, and executing). Moreover, we explore the foundational technologies for realizing conversational crowdsensing, encompassing LLM-based multi-agent systems, scenarios engineering, and conversational human-AI cooperation. Finally, we present potential industrial applications of conversational crowdsensing and discuss its implications. We envision that conversations in natural language will become the primary communication channel during the crowdsensing process, enabling richer information exchange and cooperative problem-solving among humans, robots, and AI.
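A purely hypothetical encoding of the taxonomy above, for concreteness only: three participant types, three conversation levels, and three sensing phases. All names are ours; the paper defines no API.

```python
# Illustrative data model for the conversational-crowdsensing taxonomy.
from dataclasses import dataclass
from enum import Enum

class Participant(Enum):
    BIOLOGICAL = "human worker"
    ROBOTIC = "robot / vehicle / sensor platform"
    DIGITAL = "LLM-based software agent"

class Conversation(Enum):
    INTER_HUMAN = "human <-> human"
    HUMAN_AI = "human <-> AI"
    INTER_AI = "AI <-> AI"

class Phase(Enum):
    REQUESTING = 1
    SCHEDULING = 2
    EXECUTING = 3

@dataclass
class SensingTask:
    description: str              # natural-language task request
    phase: Phase = Phase.REQUESTING
```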
The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness
Du, Mengyao, Zhang, Miao, Pu, Yuwen, Xu, Kai, Ji, Shouling, Yin, Quanjun
To tackle the scarcity and privacy issues associated with domain-specific datasets, combining federated learning with fine-tuning has emerged as a practical solution. However, our findings reveal that federated learning risks skewing fine-tuning features and compromising the out-of-distribution robustness of the model. By introducing three robustness indicators and conducting experiments across diverse robust datasets, we elucidate these phenomena by scrutinizing the diversity, transferability, and deviation within the model feature space. To mitigate the negative impact of federated learning on model robustness, we introduce GNP, a \underline{G}eneral \underline{N}oisy \underline{P}rojection-based robust algorithm, which ensures no deterioration of accuracy on the target distribution. Specifically, the key strategy for enhancing model robustness entails transferring robustness from the pre-trained model to the fine-tuned model, coupled with adding a small amount of Gaussian noise to augment the representative capacity of the model. Comprehensive experimental results demonstrate that our approach markedly enhances robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods and different levels of data heterogeneity.
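A minimal sketch of the two ingredients the abstract names: pulling fine-tuned features toward the frozen pre-trained model's features and injecting a small amount of Gaussian noise. The interpolation form and the sigma/tau values are illustrative assumptions, not GNP's exact construction.

```python
# Hypothetical noisy-projection module over paired feature tensors.
import torch
import torch.nn as nn

class NoisyProjection(nn.Module):
    def __init__(self, sigma=0.01, tau=0.3):
        super().__init__()
        self.sigma, self.tau = sigma, tau

    def forward(self, feat_ft, feat_pre):
        # feat_ft: features from the fine-tuned backbone
        # feat_pre: features from the frozen pre-trained backbone
        noise = self.sigma * torch.randn_like(feat_ft)
        # interpolate toward the robust pre-trained features, then perturb
        return (1 - self.tau) * feat_ft + self.tau * feat_pre + noise
```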
Asymmetrically Decentralized Federated Learning
Li, Qinglun, Zhang, Miao, Yin, Nan, Yin, Quanjun, Shen, Li
To address the communication burden and privacy concerns associated with the centralized server in Federated Learning (FL), Decentralized Federated Learning (DFL) has emerged, which discards the server in favor of a peer-to-peer (P2P) communication framework. However, most existing DFL algorithms are based on symmetric topologies, such as ring and grid topologies, which can easily lead to deadlocks and are susceptible to the impact of network link quality in practice. To address these issues, this paper proposes the DFedSGPSM algorithm, which is based on asymmetric topologies and utilizes the Push-Sum protocol to effectively solve consensus optimization problems. To further improve performance and alleviate local heterogeneous overfitting, our algorithm combines the Sharpness-Aware Minimization (SAM) optimizer with local momentum. The SAM optimizer employs gradient perturbations to generate locally flat models and searches for models with uniformly low loss values, mitigating local heterogeneous overfitting. The local momentum accelerates the optimization process of the SAM optimizer. Theoretical analysis proves that DFedSGPSM achieves a convergence rate of $\mathcal{O}(\frac{1}{\sqrt{T}})$ in a non-convex smooth setting under mild assumptions, and also reveals that better topological connectivity yields tighter upper bounds. Empirically, extensive experiments are conducted on the MNIST, CIFAR10, and CIFAR100 datasets, demonstrating the superior performance of our algorithm compared to state-of-the-art optimizers.
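A minimal sketch of the local SAM-with-momentum step described above, following the standard SAM recipe: perturb the weights along the normalized gradient, recompute the gradient at the perturbed point, then take a momentum step. The rho, lr, and beta values are illustrative, and the Push-Sum aggregation is omitted.

```python
# One local SAM + momentum step; buf holds per-parameter momentum
# tensors, e.g. buf = [torch.zeros_like(p) for p in model.parameters()].
import torch

def sam_momentum_step(model, loss_fn, data, buf, rho=0.05, lr=0.1, beta=0.9):
    # 1) gradient at the current weights
    loss_fn(model, data).backward()
    grads = [p.grad.clone() for p in model.parameters()]
    norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)) + 1e-12
    # 2) ascend to the perturbed point w + rho * g / ||g||
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p += rho * g / norm
            p.grad = None
    # 3) gradient at the perturbed point
    loss_fn(model, data).backward()
    with torch.no_grad():
        for p, g0, m in zip(model.parameters(), grads, buf):
            p -= rho * g0 / norm           # undo the perturbation
            m.mul_(beta).add_(p.grad)      # update local momentum buffer
            p -= lr * m                    # momentum descent step
            p.grad = None
```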
DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning
Li, Qinglun, Shen, Li, Li, Guanghao, Yin, Quanjun, Tao, Dacheng
To address the communication burden associated with federated learning (FL), decentralized federated learning (DFL) discards the central server and establishes a decentralized communication network, where each client communicates only with its neighbors. However, existing DFL methods still suffer from two major challenges that have not been fundamentally addressed: local inconsistency and local heterogeneous overfitting. To tackle these issues, we propose novel DFL algorithms, DFedADMM and its enhanced version DFedADMM-SAM. DFedADMM employs primal-dual optimization (ADMM), utilizing dual variables to control the model inconsistency arising from decentralized heterogeneous data distributions. DFedADMM-SAM further improves on DFedADMM by employing a Sharpness-Aware Minimization (SAM) optimizer, which uses gradient perturbations to generate locally flat models and searches for models with uniformly low loss values to mitigate local heterogeneous overfitting. Theoretically, we derive convergence rates of $\small \mathcal{O}\Big(\frac{1}{\sqrt{KT}}+\frac{1}{KT(1-\psi)^2}\Big)$ and $\small \mathcal{O}\Big(\frac{1}{\sqrt{KT}}+\frac{1}{KT(1-\psi)^2}+ \frac{1}{T^{3/2}K^{1/2}}\Big)$ in the non-convex setting for DFedADMM and DFedADMM-SAM, respectively, where $1 - \psi$ represents the spectral gap of the gossip matrix. Empirically, extensive experiments on the MNIST, CIFAR10, and CIFAR100 datasets demonstrate that our algorithms exhibit superior performance in terms of both generalization and convergence speed compared to existing state-of-the-art (SOTA) optimizers in DFL.
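A minimal sketch of the primal-dual control the abstract describes: each client augments its local loss with a dual term penalizing drift from a neighborhood average, then updates its dual variable. The penalty weight mu and the neighbor-averaging step are illustrative assumptions.

```python
# Hypothetical ADMM-style local objective and dual update; `duals` is a
# list of tensors shaped like the model parameters.
import torch

def augmented_local_loss(model, loss_fn, data, avg, duals, mu=0.1):
    """f_i(x) + <lambda_i, x - x_avg> + (mu/2) ||x - x_avg||^2."""
    loss = loss_fn(model, data)
    for p, a, lam in zip(model.parameters(), avg.parameters(), duals):
        diff = p - a.detach()
        loss = loss + (lam * diff).sum() + 0.5 * mu * diff.pow(2).sum()
    return loss

@torch.no_grad()
def dual_update(model, avg, duals, mu=0.1):
    """lambda_i <- lambda_i + mu * (x_i - x_avg)."""
    for p, a, lam in zip(model.parameters(), avg.parameters(), duals):
        lam += mu * (p - a)
```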
Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder
Zeng, Junjie, Qin, Long, Hu, Yue, Hu, Cong, Yin, Quanjun
In this paper, we present a hierarchical path planning framework called SG-RL (subgoal graphs-reinforcement learning) to plan rational paths for agents maneuvering in continuous and uncertain environments. By "rational", we mean (1) efficient path planning that eliminates first-move lags, and (2) collision-free, smooth paths that satisfy the agents' kinematic constraints. SG-RL works in a two-level manner. At the first level, SG-RL uses a geometric path-planning method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract paths, also called subgoal sequences. At the second level, SG-RL uses an RL method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal motion-planning policies that can generate kinematically feasible and collision-free trajectories between adjacent subgoals. The first advantage of the proposed method is that SSG overcomes the limitations of sparse rewards and local-minimum traps for RL agents; thus, LSPI can be used to generate paths in complex environments. The second advantage is that, when the environment changes slightly (e.g., unexpected obstacles appear), SG-RL does not need to reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI can deal with uncertainties by exploiting its generalization ability to handle changes in environments. Simulation experiments in representative scenarios demonstrate that, compared with existing methods, SG-RL works well on large-scale maps with relatively low action-switching frequencies and shorter path lengths, and can deal with small changes in environments. We further demonstrate that the design of reward functions and the types of training environments are important factors for learning feasible policies.
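A minimal sketch of the two-level loop described above: the subgoal-graph planner supplies an abstract subgoal sequence, and a learned low-level policy steers the agent between consecutive subgoals. The callables plan_subgoals, policy, step_env, and reached are stand-in names for components the paper provides.

```python
# Two-level SG-RL navigation loop over injected components.
def sg_rl_navigate(start, goal, plan_subgoals, policy, step_env,
                   reached, max_steps=10_000):
    subgoals = plan_subgoals(start, goal)   # level 1: SSG abstract path
    state = start
    for sg in subgoals:                     # level 2: RL between subgoals
        for _ in range(max_steps):
            if reached(state, sg):
                break
            action = policy(state, sg)      # LSPI-learned local policy
            state = step_env(state, action)
    return state
```

Because the low-level policy generalizes, a small map change only perturbs level 2; the subgoal sequence from level 1 can be reused unchanged.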
Inverse Reinforcement Learning Based Human Behavior Modeling for Goal Recognition in Dynamic Local Network Interdiction
Zeng, Yunxiu (National University of Defense Technology) | Xu, Kai (National University of Defense Technology) | Yin, Quanjun (National University of Defense Technology) | Qin, Long (National University of Defense Technology) | Zha, Yabing (National University of Defense Technology) | Yeoh, William (Washington University in St. Louis)
Goal recognition is the task of inferring an agent's goals given some or all of the agent's observed actions. Among different problem formulations, goal recognition can be solved as a model-based planning problem using off-the-shelf planners. However, obtaining accurate cost or reward models of an agent and incorporating them into the planning model becomes an issue in real applications. To this end, we propose an Inverse Reinforcement Learning (IRL)-based opponent behavior modeling method and apply it to the goal-recognition-assisted Dynamic Local Network Interdiction (DLNI) problem. We first introduce the overall framework and the DLNI problem domain of our work. After that, an IRL-based human behavior modeling method and Markov Decision Process-based goal recognition are introduced. Experimental results indicate that our learned behavior model has higher tracking accuracy and yields better interdiction outcomes than other models.
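A minimal sketch of MDP-based goal recognition on top of an IRL-learned reward, under the standard (but here illustrative) assumption that actions are Boltzmann-rational in a goal-conditioned Q-function. q_values is a stand-in for the model fitted by IRL.

```python
# Bayesian goal posterior from observed (state, action) pairs:
# P(g | o_1..t) ∝ P(g) * prod_t softmax_a Q_g(s_t, ·)[a_t].
import math

def goal_posterior(observations, goals, q_values, prior=None):
    log_post = {g: math.log(prior[g]) if prior else 0.0 for g in goals}
    for s, a in observations:
        for g in goals:
            qs = q_values(g, s)                 # dict: action -> Q-value
            m0 = max(qs.values())               # stable log-sum-exp
            z = m0 + math.log(sum(math.exp(q - m0) for q in qs.values()))
            log_post[g] += qs[a] - z            # add log-likelihood of a
    m = max(log_post.values())
    w = {g: math.exp(v - m) for g, v in log_post.items()}
    total = sum(w.values())
    return {g: v / total for g, v in w.items()}
```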