Hu, Min
Dual-Splitting Conformal Prediction for Multi-Step Time Series Forecasting
Yu, Qingdi, Cao, Zhiwei, Wang, Ruihang, Yang, Zhen, Deng, Lijun, Hu, Min, Luo, Yong, Zhou, Xin
Time series forecasting is crucial for applications like resource scheduling and risk management, where multi-step predictions provide a comprehensive view of future trends. Uncertainty Quantification (UQ) is a mainstream approach for addressing forecasting uncertainties, with Conformal Prediction (CP) gaining attention due to its model-agnostic nature and statistical guarantees. However, most variants of CP are designed for single-step predictions and face challenges in multi-step scenarios, such as reliance on real-time data and limited scalability. This highlights the need for CP methods specifically tailored to multi-step forecasting. We propose the Dual-Splitting Conformal Prediction (DSCP) method, a novel CP approach designed to capture inherent dependencies within time-series data for multi-step forecasting. Experimental results on real-world datasets from four different domains demonstrate that the proposed DSCP significantly outperforms existing CP variants in terms of the Winkler Score, achieving a performance improvement of up to 23.59% compared to state-of-the-art methods. Furthermore, we deployed the DSCP approach for renewable energy generation and IT load forecasting in power management of a real-world trajectory-based application, achieving an 11.25% reduction in carbon emissions through predictive optimization of data center operations and controls.
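The abstract does not spell out the dual-splitting construction itself, so the following is only a minimal sketch of the generic split conformal recipe that DSCP builds on, applied independently at each forecast horizon; `model.predict`, the data shapes, and the per-horizon calibration are illustrative assumptions, not the paper's method.

```python
# Minimal per-horizon split conformal prediction sketch (not DSCP itself).
# Assumes a fitted multi-step forecaster with a hypothetical `predict` method
# returning arrays of shape (n_samples, H) for H forecast steps.
import numpy as np

def split_conformal_intervals(model, X_cal, Y_cal, X_test, alpha=0.1):
    preds_cal = model.predict(X_cal)          # (n_cal, H)
    scores = np.abs(Y_cal - preds_cal)        # absolute residuals per step
    n = len(Y_cal)
    # Finite-sample-corrected quantile level for (1 - alpha) coverage.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(scores, q_level, axis=0)  # one width per horizon, (H,)
    preds = model.predict(X_test)             # (n_test, H)
    return preds - q, preds + q               # lower and upper interval bands
```

Intervals produced this way can then be evaluated with the Winkler score used in the paper, which penalizes both interval width and miscoverage.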
ReverseNER: A Self-Generated Example-Driven Framework for Zero-Shot Named Entity Recognition with Large Language Models
Wang, Anbang, Mei, Difei, Zhang, Zhichao, Bai, Xiuxiu, Yao, Ran, Fang, Zewen, Hu, Min, Cao, Zhirui, Sun, Haitao, Guo, Yifeng, Zhou, Hongyao, Guo, Yu
This paper presents ReverseNER, a method aimed at overcoming a key limitation of large language models (LLMs) in zero-shot named entity recognition (NER): their reliance on pre-provided demonstrations. ReverseNER tackles this challenge by constructing a reliable example library composed of dozens of entity-labeled sentences, generated through the reverse of the usual NER process. Specifically, whereas conventional NER methods label the entities in a given sentence, ReverseNER reverses the process, using an LLM to generate entities from their definitions and then expand them into full sentences. During this expansion, the LLM is guided to generate sentences that replicate the structures of a set of specific "feature sentences", extracted from the task sentences by clustering. The expansion yields dozens of entity-labeled, task-relevant sentences. After constructing the example library, the method selects several semantically similar entity-labeled examples for each task sentence as references to facilitate the LLM's entity recognition. We also propose an entity-level self-consistency scoring mechanism to improve NER performance with LLMs. Experiments show that ReverseNER significantly outperforms other zero-shot NER methods with LLMs, marking a notable improvement in NER for domains without labeled data while reducing computational resource consumption.
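As a rough illustration of the reverse process described above, here is a hedged Python sketch of the example-library construction; `llm(prompt)` is a hypothetical text-generation callable, and the prompts, parsing, and helper names are assumptions rather than the paper's actual interface.

```python
# Hedged sketch of ReverseNER-style example-library construction.
# `llm(prompt) -> str` is a hypothetical LLM completion function.

def build_example_library(entity_definitions, feature_sentences, llm):
    library = []
    for entity_type, definition in entity_definitions.items():
        # Reverse step 1: generate candidate entities from a type definition.
        raw = llm(f"List entities of type '{entity_type}', defined as: "
                  f"{definition}. Answer with a comma-separated list.")
        for entity in [e.strip() for e in raw.split(",") if e.strip()]:
            # Reverse step 2: expand each entity into a full sentence that
            # mimics the structure of a clustered 'feature sentence'.
            for template in feature_sentences:
                sentence = llm(f"Write one sentence mentioning '{entity}' "
                               f"with the same structure as: '{template}'")
                library.append({"sentence": sentence,
                                "entity": entity,
                                "type": entity_type})
    return library
```

At inference time, the library entries most similar to each task sentence (e.g., by embedding similarity) would then be selected as in-context demonstrations.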
Futures Quantitative Investment with Heterogeneous Continual Graph Neural Network
Hu, Min, Tan, Zhizhong, Liu, Bin, Yin, Guosheng
This study addresses the challenges of futures price prediction in high-frequency trading (HFT) by proposing a continual learning factor predictor based on graph neural networks. The model integrates multi-factor pricing theories with real-time market dynamics, overcoming the limitations of existing methods that lack financial theory guidance and ignore trend signals and their interactions. We propose three heterogeneous tasks, price moving-average regression, price gap regression, and change-point detection, to trace the short-, intermediate-, and long-term trend factors present in the data. The study also considers the cross-sectional correlation characteristics of futures contracts, whose prices often show strong dynamic correlations: each variable (futures contract) depends not only on its own historical values (temporal) but also on the observations of other variables (cross-sectional). To capture these dynamic relationships more accurately, we adopt a spatio-temporal graph neural network (STGNN) to enhance the predictive power of the model. The model employs a continual learning strategy to consider these tasks (factors) simultaneously. Because the tasks are heterogeneous, we propose calculating parameter importance as the mutual information between the original observations and the extracted features to mitigate catastrophic forgetting (CF). Empirical tests on 49 commodity futures in China's futures market demonstrate that the proposed model outperforms other state-of-the-art models in prediction accuracy. This research not only promotes the integration of financial theory and deep learning but also provides a scientific basis for actual trading decisions.
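The abstract does not give the exact regularizer, but the described importance-weighted mitigation of catastrophic forgetting can be sketched as a quadratic penalty in PyTorch; here `importance` stands in for the paper's mutual-information-based estimate, whose computation is not reproduced.

```python
# Hedged sketch of importance-weighted continual learning (an EWC-style
# penalty). `old_params` and `importance` map parameter names to tensors saved
# after the previous task; the mutual-information importance estimate itself
# is assumed, not implemented here.
import torch

def continual_loss(model, task_loss, old_params, importance, lam=1.0):
    penalty = torch.zeros((), device=task_loss.device)
    for name, p in model.named_parameters():
        if name in old_params:
            # Parameters deemed important for earlier tasks are anchored to
            # their previous values; unimportant ones remain free to move.
            penalty = penalty + (importance[name]
                                 * (p - old_params[name]).pow(2)).sum()
    return task_loss + lam * penalty
```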
Adaptive loose optimization for robust question answering
Ma, Jie, Wang, Pinghui, Wang, Zewei, Kong, Dechen, Hu, Min, Han, Ting, Liu, Jun
Question answering (QA) methods are well known for exploiting data bias, such as the language prior in visual question answering and the position bias in machine reading comprehension (extractive question answering). Current debiasing methods often sacrifice significant in-distribution performance to achieve favorable out-of-distribution generalizability, while non-debiasing methods sacrifice considerable out-of-distribution performance to obtain high in-distribution performance. It is therefore difficult for either to cope with complicated, changing real-world situations. In this paper, we propose a simple yet effective loss function with adaptive loose optimization, which seeks to make the best of both worlds for question answering. Our main technical contribution is to reduce the loss adaptively according to the ratio between the previous and current optimization states on mini-batch training data. This loose optimization prevents non-debiasing methods from over-learning data bias while allowing debiasing methods to retain slight bias learning. Experiments on visual question answering datasets, including VQA v2, VQA-CP v1, VQA-CP v2, and GQA-OOD, and the extractive question answering dataset SQuAD demonstrate that our approach enables QA methods to obtain state-of-the-art in- and out-of-distribution performance in most cases. The source code has been released publicly at https://github.com/reml-group/ALO.
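To make the core idea concrete, here is a minimal sketch of loss scaling by the ratio of the previous to the current mini-batch loss; the exact weighting rule and clipping are assumptions, and the released code at https://github.com/reml-group/ALO is the authoritative reference.

```python
# Hedged sketch of adaptive loose optimization: scale the mini-batch loss by
# the previous-to-current loss ratio, capped at 1 so the objective is only
# ever loosened, never amplified. The precise rule is an assumption.
import torch

class LooseLoss:
    def __init__(self, base_loss_fn):
        self.base_loss_fn = base_loss_fn
        self.prev_loss = None  # raw loss value from the previous mini-batch

    def __call__(self, logits, targets):
        raw = self.base_loss_fn(logits, targets)
        loss = raw
        if self.prev_loss is not None:
            # Down-weight the batch when its loss exceeds the previous one,
            # i.e. when the ratio prev/current falls below 1.
            ratio = (self.prev_loss / (raw.detach() + 1e-8)).clamp(max=1.0)
            loss = ratio * raw
        self.prev_loss = raw.detach()
        return loss

# Usage: wrap a standard criterion.
criterion = LooseLoss(torch.nn.functional.cross_entropy)
```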
Neural CRF transducers for sequence labeling
Hu, Kai, Ou, Zhijian, Hu, Min, Feng, Junlan
Conditional random fields (CRFs) are among the most successful approaches to sequence labeling. Various linear-chain neural CRFs (NCRFs) have been developed to implement non-linear node potentials in CRFs, while keeping the linear-chain hidden structure. In this paper, we propose NCRF transducers, which consist of two RNNs: one extracting features from the observations and the other capturing (theoretically infinite) long-range dependencies between labels. Different sequence labeling methods are evaluated on POS tagging, chunking, and NER (English and Dutch). Experimental results show that NCRF transducers achieve consistent improvements over linear-chain NCRFs and RNN transducers across all four tasks and can improve state-of-the-art results.
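A minimal PyTorch sketch of the two-RNN transducer architecture described above follows; the dimensions, start-symbol handling, and joint scoring layer are illustrative assumptions rather than the paper's exact configuration.

```python
# Hedged sketch of an NCRF-transducer-style model: one RNN encodes the
# observations, a second RNN consumes the label history, so label dependencies
# are not restricted to the previous tag as in a linear-chain CRF.
import torch
import torch.nn as nn

class NCRFTransducer(nn.Module):
    def __init__(self, input_dim, num_labels, hidden=128):
        super().__init__()
        self.encoder = nn.LSTM(input_dim, hidden, batch_first=True)
        self.label_emb = nn.Embedding(num_labels + 1, hidden)  # +1: start tag
        self.predictor = nn.LSTM(hidden, hidden, batch_first=True)
        self.joint = nn.Linear(2 * hidden, num_labels)

    def forward(self, x, prev_labels):
        # x: (B, T, input_dim); prev_labels: (B, T) gold labels shifted right,
        # with the start-tag index at position 0 (teacher forcing).
        obs, _ = self.encoder(x)                               # observation features
        hist, _ = self.predictor(self.label_emb(prev_labels))  # label history
        return self.joint(torch.cat([obs, hist], dim=-1))      # per-step scores
```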