Goto

Collaborating Authors

 Jiang, Jun


SplitFrozen: Split Learning with Device-side Model Frozen for Fine-Tuning LLM on Heterogeneous Resource-Constrained Devices

arXiv.org Artificial Intelligence

Fine-tuning large language models (LLMs) on private, on-device data can empower tailored personalized AI agents. However, fine-tuning LLMs on resource-constrained edge devices faces significant challenges, including excessive computation overhead, device heterogeneity, and data imbalance. This paper proposes SplitFrozen, a split learning framework that enables efficient LLM fine-tuning by strategically freezing device-side model layers while centralizing parameter-efficient fine-tuning on the server. Our framework partitions LLMs into device-side frozen layers and server-side fine-tuning layers, where heterogeneous resource-constrained devices execute only forward propagation. To minimize server-side training costs, we integrate Low-Rank Adaptation (LoRA) into the server-side layers. A pipeline parallelism strategy further optimizes training efficiency by decoupling device-server computations and leveraging decomposed backward propagation. Experiments on GPT-2 with the MRPC, MNLI-matched, and SST-2 datasets demonstrate that SplitFrozen outperforms FedLoRA and SplitLoRA by 69.4\% model accuracy under extremely imbalanced data, while reducing up to 86.8\% device-side computations and 50.2\% total training time. Experiments also validate the scalability of SplitFrozen on content generation task using Llama-3.2 model on GSM8K dataset.


A MIMO Wireless Channel Foundation Model via CIR-CSI Consistency

arXiv.org Artificial Intelligence

In the field of artificial intelligence, self-supervised learning has demonstrated superior generalization capabilities by leveraging large-scale unlabeled datasets for pretraining, which is especially critical for wireless communication models to adapt to a variety of scenarios. This paper innovatively treats Channel State Information (CSI) and Channel Impulse Response (CIR) as naturally aligned multi-modal data and proposes the first MIMO wireless channel foundation model, named CSI-CLIP. By effectively capturing the joint representations of both CIR and CSI, CSI-CLIP exhibits remarkable adaptability across scenarios and robust feature extraction capabilities. Experimental results show that in positioning task, CSI-CLIP reduces the mean error distance by 22%; in beam management task, it increases accuracy by 1% compared to traditional supervised methods, as well as in the channel identification task. These improvements not only highlight the potential and value of CSI-CLIP in integrating sensing and communication but also demonstrate its significant advantages over existing techniques. Moreover, viewing CSI and CIR as multi-modal pairs and contrastive learning for wireless channel foundation model open up new research directions in the domain of MIMO wireless communications.


Deep Reinforcement Learning-Based Bidding Strategies for Prosumers Trading in Double Auction-Based Transactive Energy Market

arXiv.org Artificial Intelligence

--With the large number of prosumers deploying distributed energy resources (DERs), integrating these prosumers into a transactive energy market (TEM) is a trend for the future smart grid. A community-based double auction market is considered a promising TEM that can encourage prosumers to participate and maximize social welfare. However, the traditional TEM is challenging to model explicitly due to the random bidding behavior of prosumers and uncertainties caused by the energy operation of DERs. Furthermore, although reinforcement learning algorithms provide a model-free solution to optimize prosumers' bidding strategies, their use in TEM is still challenging due to their scalability, stability, and privacy protection limitations. T o address the above challenges, in this study, we design a double auction-based TEM with multiple DERs-equipped prosumers to transparently and efficiently manage energy transactions. We also propose a deep reinforcement learning (DRL) model with distributed learning and execution to ensure the scalability and privacy of the market environment. Simulation results show that (1) the designed TEM and DRL model are robust; (2) the proposed DRL model effectively balances the energy payment and comfort satisfaction for prosumers and outperforms the state-of-the-art methods in optimizing the bidding strategies. ITH the extensive deployment of energy storage systems, solar photovoltaics (PVs), smart home appliances, and information technology, passive consumers in the traditional electricity market are gradually converted to active prosumers (producers + consumers) with distributed energy resources (DERs), who can monitor and control energy generation, consumption, storage, and transaction to achieve specific goals, such as balancing energy costs and user comfort levels [1]-[3]. However, the bi-directional energy and information flow, as well as the variability of distributed renewable energy, raises great challenges in the operation of power systems in a flexible and economically efficient way [4]. Liu are with the Department of Computer Science and Engineering, Santa Clara University, Santa Clara, CA, USA (e-mail: jun3525114@gmail.com, Li, M. Ghafouri, and J. Y an are with Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada (e-mail: {yuanliang.li, L. Hou is with Beijing University of Posts and Telecommunications, Beijing, China (e-mail: luyang.hou@bupt.edu.cn) Zhang is with the College of Information Engineering, Shenzhen University, Shenzhen, China (e-mail: zhangp@szu.edu.cn)


Multimodal Alignment of Histopathological Images Using Cell Segmentation and Point Set Matching for Integrative Cancer Analysis

arXiv.org Artificial Intelligence

Abstract: Histopathological imaging is vital for cancer research and clinical practice, with multiplexed Immunofluorescence (MxIF) and Hematoxylin and Eosin (H&E) providing complementary insights. However, aligning different stains at the cell level remains a challenge due to modality differences. In this paper, we present a novel framework for multimodal image alignment using cell segmentation outcomes. By treating cells as point sets, we apply Coherent Point Drift (CPD) for initial alignment and refine it with Graph Matching (GM). Evaluated on ovarian cancer tissue microarrays (TMAs), our method achieves high alignment accuracy, enabling integration of cell-level features across modalities and generating virtual H&E images from MxIF data for enhanced clinical interpretation. Keywords: Histopathology alignment, Histopathology registration, Bioimage analysis Introduction: As an important approach to reveal cell level details in cancer, histopathological images have been widely used in both clinic practice for diagnostic decision making and treatment follow up. Following different staining protocols, each modality of histopathology has its unique strength in highlighting specific aspects within tumor immune microenvironment (TIME). Among which, multiplexed Immunofluorescence (MxIF) images provide refined immune cell phenotyping, making it a favorable research tool for revealing cell behaviors in TIME. However, this imaging technique now is mainly used for research purposes due to the low reliability of marker signals caused by complex cyclic staining processes. On the other hand, H&E (Hematoxylin and Eosin) staining plays an irreplaceable role in providing standard clinical references by revealing cell morphology and texture patterns.


Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses

arXiv.org Artificial Intelligence

Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomics profiling data that describes molecular activities in tumors and cancer cell lines are widely utilized for predicting anti-cancer drug responses. However, existing AI models face challenges due to noise in transcriptomics data and lack of biological interpretability. To overcome these limitations, we introduce VETE (Variational and Explanatory Transcriptomics Encoder), a novel neural network framework that incorporates a variational component to mitigate noise effects and integrates traceable gene ontology into the neural network architecture for encoding cancer transcriptomics data. Key innovations include a local interpretability-guided method for identifying ontology paths, a visualization tool to elucidate biological mechanisms of drug responses, and the application of centralized large scale hyperparameter optimization. VETE demonstrated robust accuracy in cancer cell line classification and drug response prediction. Additionally, it provided traceable biological explanations for both tasks and offers insights into the mechanisms underlying its predictions. VETE bridges the gap between AI-driven predictions and biologically meaningful insights in cancer research, which represents a promising advancement in the field.


InstructPipe: Building Visual Programming Pipelines with Human Instructions

arXiv.org Artificial Intelligence

Visual programming provides beginner-level programmers with a coding-free experience to build their customized pipelines. Existing systems require users to build a pipeline entirely from scratch, implying that novice users need to set up and link appropriate nodes all by themselves, starting from a blank workspace. We present InstructPipe, an AI assistant that enables users to start prototyping machine learning (ML) pipelines with text instructions. We designed two LLM modules and a code interpreter to execute our solution. LLM modules generate pseudocode of a target pipeline, and the interpreter renders a pipeline in the node-graph editor for further human-AI collaboration. Technical evaluations reveal that InstructPipe reduces user interactions by 81.1% compared to traditional methods. Our user study (N=16) showed that InstructPipe empowers novice users to streamline their workflow in creating desired ML pipelines, reduce their learning curve, and spark innovative ideas with open-ended commands.