Goto

Collaborating Authors

 Shen, Yanyan


Improving the End-to-End Efficiency of Offline Inference for Multi-LLM Applications Based on Sampling and Simulation

arXiv.org Artificial Intelligence

As large language models (LLMs) have shown great success in many tasks, they are used in various applications. While a lot of works have focused on the efficiency of single-LLM application (e.g., offloading, request scheduling, parallelism strategy selection), multi-LLM applications receive less attention, particularly in offline inference scenarios. In this work, we aim to improve the offline end-to-end inference efficiency of multi-LLM applications in the single-node multi-GPU environment. The problem involves two key decisions: (1) determining which LLMs to run concurrently each time (we may not run all the models at the same time), and (2) selecting a parallelism strategy to use for each LLM. This problem is NP-hard. Naive solutions may not work well because the running time for a model to complete a set of requests depends on the request workload and the selected parallelism strategy, and they lack an accurate model of the running time. As the LLM output lengths are unknown before running, to estimate the model running time, we propose a sampling-then-simulation method which first estimates the output lengths by sampling from an empirical cumulative function we obtained from a large dataset in advance, and then simulates the LLM inference process accordingly. Based on the simulation, we estimate the per-iteration latencys to get the total latency. A greedy method is proposed to optimize the scheduling of the LLMs in the application across the GPUs. We then propose a framework SamuLLM which contains two phases: planning, which calls the greedy method for an application and running, which runs the application and dynamically adjust the model scheduling based on the runtime information. Experiments on 3 applications and a mixed application show that SamuLLM can achieve 1.0-2.4$\times$ end-to-end speedups compared to the competitors.


Proactive Model Adaptation Against Concept Drift for Online Time Series Forecasting

arXiv.org Machine Learning

Time series forecasting always faces the challenge of concept drift, where data distributions evolve over time, leading to a decline in forecast model performance. Existing solutions are based on online learning, which continually organize recent time series observations as new training samples and update model parameters according to the forecasting feedback on recent data. However, they overlook a critical issue: obtaining ground-truth future values of each sample should be delayed until after the forecast horizon. This delay creates a temporal gap between the training samples and the test sample. Our empirical analysis reveals that the gap can introduce concept drift, causing forecast models to adapt to outdated concepts. In this paper, we present \textsc{Proceed}, a novel proactive model adaptation framework for online time series forecasting. \textsc{Proceed} first estimates the concept drift between the recently used training samples and the current test sample. It then employs an adaptation generator to efficiently translate the estimated drift into parameter adjustments, proactively adapting the model to the test sample. To enhance the generalization capability of the framework, \textsc{Proceed} is trained on synthetic diverse concept drifts. Extensive experiments on five real-world datasets across various forecast models demonstrate that \textsc{Proceed} brings more performance improvements than the state-of-the-art online learning methods, significantly facilitating forecast models' resilience against concept drifts. Code is available at \url{https://github.com/SJTU-DMTai/OnlineTSF}.


When UAV Meets Federated Learning: Latency Minimization via Joint Trajectory Design and Resource Allocation

arXiv.org Artificial Intelligence

Federated learning (FL) has emerged as a pivotal solution for training machine learning models over wireless networks, particularly for Internet of Things (IoT) devices with limited computation resources. Despite its benefits, the efficiency of FL is often restricted by the communication quality between IoT devices and the central server. To address this issue, we introduce an innovative approach by deploying an unmanned aerial vehicle (UAV) as a mobile FL server to enhance the training process of FL. By leveraging the UAV's maneuverability, we establish robust line-of-sight connections with IoT devices, significantly improving communication capacity. To improve the overall training efficiency, we formulate a latency minimization problem by jointly optimizing the bandwidth allocation, computing frequencies, transmit power for both the UAV and IoT devices, and the UAV's trajectory. Then, an efficient alternating optimization algorithm is developed to solve it efficiently. Furthermore, we analyze the convergence and computational complexity of the proposed algorithm. Finally, numerical results demonstrate that our proposed scheme not only outperforms existing benchmark schemes in terms of latency but also achieves training efficiency that closely approximate the ideal scenario.


NeurDB: An AI-powered Autonomous Data System

arXiv.org Artificial Intelligence

In the wake of rapid advancements in artificial intelligence (AI), we stand on the brink of a transformative leap in data systems. The imminent fusion of AI and DB (AIxDB) promises a new generation of data systems, which will relieve the burden on end-users across all industry sectors by featuring AI-enhanced functionalities, such as personalized and automated in-database AI-powered analytics, self-driving capabilities for improved system performance, etc. In this paper, we explore the evolution of data systems with a focus on deepening the fusion of AI and DB. We present NeurDB, an AI-powered autonomous data system designed to fully embrace AI design in each major system component and provide in-database AI-powered analytics. We outline the conceptual and architectural overview of NeurDB, discuss its design choices and key components, and report its current development and future plan.


Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition

arXiv.org Artificial Intelligence

Emotion recognition based on Electroencephalography (EEG) has gained significant attention and diversified development in fields such as neural signal processing and affective computing. However, the unique brain anatomy of individuals leads to non-negligible natural differences in EEG signals across subjects, posing challenges for cross-subject emotion recognition. While recent studies have attempted to address these issues, they still face limitations in practical effectiveness and model framework unity. Current methods often struggle to capture the complex spatial-temporal dynamics of EEG signals and fail to effectively integrate multimodal information, resulting in suboptimal performance and limited generalizability across subjects. To overcome these limitations, we develop a Pre-trained model based Multimodal Mood Reader for cross-subject emotion recognition that utilizes masked brain signal modeling and interlinked spatial-temporal attention mechanism. The model learns universal latent representations of EEG signals through pre-training on large scale dataset, and employs Interlinked spatial-temporal attention mechanism to process Differential Entropy(DE) features extracted from EEG data. Subsequently, a multi-level fusion layer is proposed to integrate the discriminative features, maximizing the advantages of features across different dimensions and modalities. Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks, outperforming state-of-the-art methods. Additionally, the model is dissected from attention perspective, providing qualitative analysis of emotion-related brain areas, offering valuable insights for affective research in neural signal processing.


Rethinking Channel Dependence for Multivariate Time Series Forecasting: Learning from Leading Indicators

arXiv.org Artificial Intelligence

Recently, channel-independent methods have achieved state-of-the-art performance in multivariate time series (MTS) forecasting. Despite reducing overfitting risks, these methods miss potential opportunities in utilizing channel dependence for accurate predictions. We argue that there exist locally stationary lead-lag relationships between variates, i.e., some lagged variates may follow the leading indicators within a short time period. Exploiting such channel dependence is beneficial since leading indicators offer advance information that can be used to reduce the forecasting difficulty of the lagged variates. In this paper, we propose a new method named LIFT that first efficiently estimates leading indicators and their leading steps at each time step and then judiciously allows the lagged variates to utilize the advance information from leading indicators. LIFT plays as a plugin that can be seamlessly collaborated with arbitrary time series forecasting methods. Extensive experiments on six real-world datasets demonstrate that LIFT improves the state-of-the-art methods by 5.5% in average forecasting performance. Multivariate time series (MTS) forecasting, one of the most popular research topics, is a fundamental task in various domains such as weather, traffic, and finance. An MTS consists of multiple channels (a.k.a., variates Many MTS forecasting researches argue each channel has dependence on other channels.


Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network

arXiv.org Artificial Intelligence

Recently, Their goal is to learn or search for optimal graph objects that heterogeneous graph neural networks (HGNs) have maximize mutual information with the predictions. While become one of the standard paradigms for modeling rich such explanations answer the question "what is salient to semantics of heterogeneous graphs in various application the prediction", they fail to unveil "how the salient objects domains such as e-commerce, finance, and healthcare (Lv affect the prediction". In particular, there may exist multiple et al. 2021; Wang et al. 2022). In parallel with the proliferation paths in the graph to propagate the information of the salient of HGNs, understanding the reasons behind the objects to the target object and affect its prediction. Without predictions from HGNs is urgently demanded in order to distinguishing these different influential paths, the answer to build trust and confidence in the models for both users and the "how" question remains unclear, which could compromise stakeholders. For example, a customer would be satisfied if the utility of the explanation. This issue becomes more an HGN-based recommender system accompanies recommended prominent when it comes to explaining HGNs due to the items with explanations; a bank manager may want complex semantics of heterogeneous graphs.


SSIN: Self-Supervised Learning for Rainfall Spatial Interpolation

arXiv.org Artificial Intelligence

The acquisition of accurate rainfall distribution in space is an important task in hydrological analysis and natural disaster pre-warning. However, it is impossible to install rain gauges on every corner. Spatial interpolation is a common way to infer rainfall distribution based on available raingauge data. However, the existing works rely on some unrealistic pre-settings to capture spatial correlations, which limits their performance in real scenarios. To tackle this issue, we propose the SSIN, which is a novel data-driven self-supervised learning framework for rainfall spatial interpolation by mining latent spatial patterns from historical observation data. Inspired by the Cloze task and BERT, we fully consider the characteristics of spatial interpolation and design the SpaFormer model based on the Transformer architecture as the core of SSIN. Our main idea is: by constructing rich self-supervision signals via random masking, SpaFormer can learn informative embeddings for raw data and then adaptively model spatial correlations based on rainfall spatial context. Extensive experiments on two real-world raingauge datasets show that our method outperforms the state-of-the-art solutions. In addition, we take traffic spatial interpolation as another use case to further explore the performance of our method, and SpaFormer achieves the best performance on one large real-world traffic dataset, which further confirms the effectiveness and generality of our method.


Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey

arXiv.org Artificial Intelligence

Predicting stock prices presents a challenging research problem due to the inherent volatility and non-linear nature of the stock market. In recent years, knowledge-enhanced stock price prediction methods have shown groundbreaking results by utilizing external knowledge to understand the stock market. Despite the importance of these methods, there is a scarcity of scholarly works that systematically synthesize previous studies from the perspective of external knowledge types. Specifically, the external knowledge can be modeled in different data structures, which we group into non-graph-based formats and graph-based formats: 1) non-graph-based knowledge captures contextual information and multimedia descriptions specifically associated with an individual stock; 2) graph-based knowledge captures interconnected and interdependent information in the stock market. This survey paper aims to provide a systematic and comprehensive description of methods for acquiring external knowledge from various unstructured data sources and then incorporating it into stock price prediction models. We also explore fusion methods for combining external knowledge with historical price features. Moreover, this paper includes a compilation of relevant datasets and delves into potential future research directions in this domain.


DoubleAdapt: A Meta-learning Approach to Incremental Learning for Stock Trend Forecasting

arXiv.org Artificial Intelligence

Stock trend forecasting is a fundamental task of quantitative investment where precise predictions of price trends are indispensable. As an online service, stock data continuously arrive over time. It is practical and efficient to incrementally update the forecast model with the latest data which may reveal some new patterns recurring in the future stock market. However, incremental learning for stock trend forecasting still remains under-explored due to the challenge of distribution shifts (a.k.a. concept drifts). With the stock market dynamically evolving, the distribution of future data can slightly or significantly differ from incremental data, hindering the effectiveness of incremental updates. To address this challenge, we propose DoubleAdapt, an end-to-end framework with two adapters, which can effectively adapt the data and the model to mitigate the effects of distribution shifts. Our key insight is to automatically learn how to adapt stock data into a locally stationary distribution in favor of profitable updates. Complemented by data adaptation, we can confidently adapt the model parameters under mitigated distribution shifts. We cast each incremental learning task as a meta-learning task and automatically optimize the adapters for desirable data adaptation and parameter initialization. Experiments on real-world stock datasets demonstrate that DoubleAdapt achieves state-of-the-art predictive performance and shows considerable efficiency.