Xu, Yijing
Enhancing Financial Time-Series Forecasting with Retrieval-Augmented Large Language Models
Xiao, Mengxi, Jiang, Zihao, Qian, Lingfei, Chen, Zhengyu, He, Yueru, Xu, Yijing, Jiang, Yuecheng, Li, Dong, Weng, Ruey-Ling, Peng, Min, Huang, Jimin, Ananiadou, Sophia, Xie, Qianqian
Stock movement prediction, a critical task in financial time-series forecasting, relies on identifying and retrieving key influencing factors from vast and complex datasets. However, traditional text-trained or numeric similarity-based retrieval methods often struggle to handle the intricacies of financial data. To address this, we propose the first retrieval-augmented generation (RAG) framework specifically designed for financial time-series forecasting. Our framework incorporates three key innovations: a fine-tuned 1B-parameter large language model (StockLLM) as its backbone, a novel candidate selection method enhanced by LLM feedback, and a training objective that maximizes the similarity between queries and historically significant sequences. These advancements enable our retriever, FinSeer, to uncover meaningful patterns while effectively minimizing noise in complex financial datasets. To support robust evaluation, we also construct new datasets that integrate financial indicators and historical stock prices. Experimental results demonstrate that our RAG framework outperforms both the baseline StockLLM and random retrieval methods. As the retriever, FinSeer achieves 8% higher accuracy on the BIGDATA22 benchmark and retrieves more impactful sequences than existing retrieval methods. This work highlights the importance of tailored retrieval models in financial forecasting and provides a novel, scalable framework for future research in the field.
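The abstract describes retrieving historically significant sequences by their similarity to a query sequence. A minimal sketch of such similarity-based retrieval is below; the hand-crafted embedding is a toy stand-in for a learned encoder like FinSeer, and all function names are hypothetical, not the paper's actual API.

```python
import math

def embed(seq):
    # Toy embedding of a price sequence: mean return, return volatility,
    # and cumulative growth. A learned encoder would replace this.
    returns = [b / a - 1.0 for a, b in zip(seq, seq[1:])]
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    return [mean, math.sqrt(var), seq[-1] / seq[0]]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)

def retrieve_top_k(query_seq, candidates, k=2):
    """Rank candidate historical sequences (a dict of name -> prices)
    by embedding similarity to the query and return the top-k names."""
    q = embed(query_seq)
    ranked = sorted(candidates,
                    key=lambda name: cosine(q, embed(candidates[name])),
                    reverse=True)
    return ranked[:k]
```

The retrieved sequences would then be placed in the forecasting prompt alongside the query, which is the generic RAG pattern; the paper's contribution is training the retriever with LLM feedback rather than relying on raw numeric similarity as above.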
FinBen: A Holistic Financial Benchmark for Large Language Models
Xie, Qianqian, Han, Weiguang, Chen, Zhengyu, Xiang, Ruoyu, Zhang, Xiao, He, Yueru, Xiao, Mengxi, Li, Dong, Dai, Yongfu, Feng, Duanyu, Xu, Yijing, Kang, Haoqiang, Kuang, Ziyan, Yuan, Chenhan, Yang, Kailai, Luo, Zheheng, Zhang, Tianlin, Liu, Zhiwei, Xiong, Guojun, Deng, Zhiyang, Jiang, Yuechen, Yao, Zhiyuan, Li, Haohang, Yu, Yangyang, Hu, Gang, Huang, Jiajia, Liu, Xiao-Yang, Lopez-Lira, Alejandro, Wang, Benyou, Lai, Yanzhao, Wang, Hao, Peng, Min, Ananiadou, Sophia, Huang, Jimin
Large language models (LLMs) have transformed NLP and shown promise in various fields, yet their potential in finance remains underexplored due to a lack of comprehensive evaluation benchmarks, the rapid development of LLMs, and the complexity of financial tasks. In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark for the financial domain, including 36 datasets spanning 24 financial tasks and covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three new open-source evaluation datasets for text summarization, question answering, and stock trading. Our evaluation of 15 representative LLMs, including GPT-4, ChatGPT, and the latest Gemini, reveals several key findings: while LLMs excel in IE and textual analysis, they struggle with advanced reasoning and complex tasks such as text generation and forecasting. GPT-4 excels in IE and stock trading, while Gemini is better at text generation and forecasting. Instruction-tuned LLMs improve textual analysis but offer limited benefits for complex tasks such as QA. FinBen has been used to host the first financial LLMs shared task at the FinNLP-AgentScen workshop during IJCAI-2024, attracting 12 teams. Their novel solutions outperformed GPT-4, showcasing FinBen's potential to drive innovation in financial LLMs.
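The abstract's organization of datasets into task aspects can be sketched as a small evaluation harness that aggregates accuracy per aspect. This is a hypothetical illustration, assuming a simple exact-match metric; the dataset names and functions are invented for the sketch and are not FinBen's actual API.

```python
from collections import defaultdict

# Illustrative (dataset name, aspect) pairs; FinBen covers 36 datasets
# across seven aspects.
DATASETS = [
    ("ie-dataset", "information extraction"),
    ("sentiment-dataset", "textual analysis"),
    ("qa-dataset", "question answering"),
]

def evaluate(model, examples):
    """Fraction of examples where the model's answer matches the gold label."""
    correct = sum(model(ex["input"]) == ex["label"] for ex in examples)
    return correct / len(examples)

def run_benchmark(model, loaders):
    """Average accuracy per aspect across the datasets covering it."""
    by_aspect = defaultdict(list)
    for name, aspect in DATASETS:
        by_aspect[aspect].append(evaluate(model, loaders[name]))
    return {aspect: sum(scores) / len(scores)
            for aspect, scores in by_aspect.items()}
```

Grouping scores by aspect, rather than reporting one global number, is what lets a benchmark surface the per-capability contrasts the abstract reports (e.g., strong IE but weak forecasting).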