Improving Early Sepsis Onset Prediction Through Federated Learning

Düsing, Christoph, Cimiano, Philipp

arXiv.org Artificial Intelligence

Early and accurate prediction of sepsis onset remains a major challenge in intensive care, where timely detection and subsequent intervention can significantly improve patient outcomes. While machine learning models have shown promise in this domain, their success is often limited by the amount and diversity of training data available to individual hospitals and Intensive Care Units (ICUs). Federated Learning (FL) addresses this issue by enabling collaborative model training across institutions without requiring data sharing, thus preserving patient privacy. In this work, we propose a federated, attention-enhanced Long Short-Term Memory model for sepsis onset prediction, trained on multi-centric ICU data. Unlike existing approaches that rely on fixed prediction windows, our model supports variable prediction horizons, enabling both short- and long-term forecasting in a single unified model. In our analysis, we place particular emphasis on the gains our approach achieves for early sepsis detection, i.e., predictions with large prediction windows, by conducting an in-depth temporal analysis. Our results show that FL not only improves overall prediction performance (approaching that of a centralized model), but is particularly beneficial for early sepsis onset prediction. Finally, we show that employing a variable rather than a fixed prediction window does not hurt performance significantly, while reducing computational, communication, and organizational overhead.
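The abstract's core FL mechanism, combining models trained at separate hospitals without sharing patient data, is typically realized with federated averaging. The sketch below is a minimal, hypothetical illustration of that aggregation step (it is not the authors' implementation): each client's parameters are averaged, weighted by that client's sample count.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: combine per-client model parameters into a
    global model, weighting each client by its number of training samples.

    client_weights: one list of np.ndarray parameter tensors per client
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    global_weights = []
    # Walk the parameter tensors layer by layer across all clients
    for layer_params in zip(*client_weights):
        global_weights.append(
            sum(w * (n / total) for w, n in zip(layer_params, client_sizes))
        )
    return global_weights

# Two hypothetical clients (e.g., two ICUs), each holding one weight
# matrix and one bias vector of a shared model architecture.
client_a = [np.ones((2, 2)), np.zeros(2)]
client_b = [np.full((2, 2), 3.0), np.ones(2)]
global_weights = fedavg([client_a, client_b], client_sizes=[100, 300])
```

In a full FL round, the server would broadcast `global_weights` back to the clients, each client would train locally on its own ICU data, and the cycle would repeat; only parameters, never patient records, cross institutional boundaries.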


SuryaBench: Benchmark Dataset for Advancing Machine Learning in Heliophysics and Space Weather Prediction

Roy, Sujit, Hegde, Dinesha V., Schmude, Johannes, Lin, Amy, Gaur, Vishal, Lal, Rohit, Mandal, Kshitiz, Singh, Talwinder, Muñoz-Jaramillo, Andrés, Yang, Kang, Pandey, Chetraj, Hong, Jinsu, Aydin, Berkay, McGranaghan, Ryan, Kasapis, Spiridon, Upendran, Vishal, Bahauddin, Shah, da Silva, Daniel, Freitag, Marcus, Gurung, Iksha, Pogorelov, Nikolai, Watson, Campbell, Maskey, Manil, Bernabe-Moreno, Juan, Ramachandran, Rahul

arXiv.org Artificial Intelligence

This paper introduces a high-resolution, machine-learning-ready heliophysics dataset derived from NASA's Solar Dynamics Observatory (SDO), specifically designed to advance machine learning (ML) applications in solar physics and space weather forecasting. The dataset includes processed imagery from the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI), spanning a solar cycle from May 2010 to July 2024. To ensure suitability for ML tasks, the data has been preprocessed, including correction of spacecraft roll angles, orbital adjustments, exposure normalization, and degradation compensation. We also provide auxiliary application benchmark datasets complementing the core SDO dataset. These provide benchmark applications for central heliophysics and space weather tasks such as active region segmentation, active region emergence forecasting, coronal field extrapolation, solar flare prediction, solar EUV spectra prediction, and solar wind speed estimation. By establishing a unified, standardized data collection, this dataset aims to facilitate benchmarking, enhance reproducibility, and accelerate the development of AI-driven models for critical space weather prediction tasks, bridging gaps between solar physics, machine learning, and operational forecasting.


From On-chain to Macro: Assessing the Importance of Data Source Diversity in Cryptocurrency Market Forecasting

Demosthenous, Giorgos, Georgiou, Chryssis, Polydorou, Eliada

arXiv.org Artificial Intelligence

This study investigates the impact of data source diversity on the performance of cryptocurrency forecasting models by integrating various data categories, including technical indicators, on-chain metrics, sentiment and interest metrics, traditional market indices, and macroeconomic indicators. We introduce the Crypto100 index, representing the top 100 cryptocurrencies by market capitalization, and propose a novel feature reduction algorithm to identify the most impactful and resilient features from diverse data sources. Our comprehensive experiments demonstrate that data source diversity significantly enhances the predictive performance of forecasting models across different time horizons. Key findings include the paramount importance of on-chain metrics for both short-term and long-term predictions, the growing relevance of traditional market indices and macroeconomic indicators for longer-term forecasts, and substantial improvements in model accuracy when diverse data sources are utilized. These insights help demystify the short-term and long-term driving factors of the cryptocurrency market and lay the groundwork for developing more accurate and resilient forecasting models.
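The paper's feature reduction algorithm is novel and not specified in the abstract, but the general idea of ranking candidate features from diverse sources by their relationship to the target can be sketched generically. The toy below, which is only an assumed stand-in (simple absolute Pearson correlation ranking, not the authors' method), selects the top-k most target-correlated columns from a feature matrix.

```python
import numpy as np

def rank_features_by_correlation(X, y, top_k=3):
    """Rank feature columns by absolute Pearson correlation with the target.

    X: (n_samples, n_features) feature matrix, e.g. mixing on-chain metrics,
       sentiment scores, and macroeconomic indicators
    y: (n_samples,) target series, e.g. future index returns
    """
    scores = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    )
    order = np.argsort(scores)[::-1][:top_k]  # highest correlation first
    return order, scores[order]

# Synthetic demo: column 1 is the target plus small noise, the rest are noise
rng = np.random.default_rng(0)
y = rng.normal(size=300)
X = rng.normal(size=(300, 4))
X[:, 1] = y + 0.05 * rng.normal(size=300)
top_idx, top_scores = rank_features_by_correlation(X, y, top_k=2)
```

A production selector would also need to handle the study's concern with resilience across time horizons, for instance by re-ranking over rolling windows and keeping only features that score consistently.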


Modification of a Numerical Method Using FIR Filters in a Time-dependent SIR Model for COVID-19

Pimentel, Felipe Rogério, Alves, Rafael Gustavo

arXiv.org Artificial Intelligence

Authors Yi-Cheng Chen, Ping-En Lu, Cheng-Shang Chang, and Tzu-Hsuan Liu use the Finite Impulse Response (FIR) linear system filtering method to track and predict the number of people infected and recovered from COVID-19, in a pandemic context in which there was still no vaccine and the only way to avoid contagion was isolation. To estimate the coefficients of these FIR filters, Chen et al. used machine learning methods through a classical optimization problem with regularization (ridge regression). These estimated coefficients are called ridge coefficients. The epidemic mathematical model adopted by these researchers to formulate the FIR filters is the time-dependent discrete SIR. In this paper, we propose a small modification to the algorithm of Chen et al. to obtain the ridge coefficients. We then used this modified algorithm to track and predict the number of people infected and recovered from COVID-19 in the state of Minas Gerais/Brazil, within a prediction window, during the initial period of the pandemic. We also compare the predicted data with the respective real data to check how good the approximation is. In the modified algorithm, we set values for the FIR filter orders and for the regularization parameters, both different from the respective values defined by Chen et al. in their algorithm. In this context, the numerical results obtained by the modified algorithm in some simulations present better approximation errors compared to the respective approximation errors presented by the algorithm of Chen et al.
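The ridge-regression step described above has a simple closed form: stacking lagged windows of the series into a design matrix X, the FIR coefficients solve (XᵀX + λI)a = Xᵀy. The sketch below is a minimal illustration of that estimation (filter order and regularization values here are arbitrary, not those of either paper).

```python
import numpy as np

def fit_fir_ridge(series, order, lam):
    """Estimate FIR filter coefficients by ridge regression:
    predict series[t] as a linear combination of the previous `order` values,
    with L2 penalty `lam` on the coefficients.
    """
    X = np.array([series[t - order:t] for t in range(order, len(series))])
    y = np.array(series[order:])
    # Closed-form ridge solution: (X^T X + lam*I) a = X^T y
    A = X.T @ X + lam * np.eye(order)
    return np.linalg.solve(A, X.T @ y)

def predict_next(series, coeffs):
    """One-step-ahead forecast from the fitted FIR filter."""
    order = len(coeffs)
    return float(np.dot(coeffs, series[-order:]))

# Demo on a series that doubles each step; the filter should learn
# a combination equivalent to "next = 2 * last".
series = [1, 2, 4, 8, 16, 32, 64]
coeffs = fit_fir_ridge(series, order=2, lam=1e-6)
forecast = predict_next(series, coeffs)
```

In the epidemic setting, `series` would be the daily infected or recovered counts from the time-dependent SIR model, and forecasts inside the prediction window would be compared against the subsequently observed real data.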


CRAFT: Time Series Forecasting with Cross-Future Behavior Awareness

Zhang, Yingwei, Bu, Ke, Zhuang, Zhuoran, Xie, Tao, Yu, Yao, Li, Dong, Guo, Yang, Lv, Detao

arXiv.org Artificial Intelligence

The past decades have witnessed significant advancements in time series forecasting (TSF) across various real-world domains, including e-commerce and disease spread prediction. However, TSF is usually constrained by the uncertainty of predicting future data from limited past observations. To address this problem, we explore the use of Cross-Future Behavior (CFB) in TSF, i.e., behavior that occurs before the current time but takes effect in the future. We leverage CFB features and propose the CRoss-Future Behavior Awareness based Time Series Forecasting method (CRAFT). The core idea of CRAFT is to utilize the trend of cross-future behavior to mine the trend of the time series data to be predicted. Specifically, to address the sparsity and incompleteness of cross-future behavior, CRAFT employs the Koopman Predictor Module to extract the key trend and the Internal Trend Mining Module to fill in the unknown regions of the cross-future behavior matrix. Then, we introduce the External Trend Guide Module with a hierarchical structure to acquire more representative trends from higher levels. Finally, we apply a demand-constrained loss to calibrate the distribution deviation of prediction results. Experiments on both an offline large-scale real-world dataset and an online A/B test demonstrate the effectiveness of CRAFT. Our dataset and code are available at https://github.com/CRAFTinTSF/CRAFT.


Large Language Models for Drug Overdose Prediction from Longitudinal Medical Records

Nahian, Md Sultan Al, Delcher, Chris, Harris, Daniel, Akpunonu, Peter, Kavuluru, Ramakanth

arXiv.org Artificial Intelligence

The ability to predict drug overdose risk from a patient's medical records is crucial for timely intervention and prevention. Traditional machine learning models have shown promise in analyzing longitudinal medical records for this task. However, recent advancements in large language models (LLMs) offer an opportunity to enhance prediction performance by leveraging their ability to process long textual data and their inherent prior knowledge across diverse tasks. In this study, we assess the effectiveness of OpenAI's GPT-4o LLM in predicting drug overdose events using patients' longitudinal insurance claims records. We evaluate its performance in both fine-tuned and zero-shot settings, comparing them to strong traditional machine learning methods as baselines. Our results show that LLMs not only outperform traditional models in certain settings but can also predict overdose risk in a zero-shot setting without task-specific training. Drug overdose (OD) is a major public health crisis in the United States, leading to a substantial number of emergency medical interventions and fatalities each year. According to the Centers for Disease Control and Prevention (CDC), drug overdoses claimed approximately 107,941 [1] lives in the U.S. in 2022, highlighting the urgent need for effective prevention and intervention strategies. Besides fatal outcomes and lost quality of life for patients, the misuse of prescription medications, illicit drugs, and polysubstance abuse has placed an immense burden on healthcare systems, emergency responders, and policymakers. Identifying individuals at risk early can facilitate timely interventions, such as targeted clinical assessments, behavioral support, and prescription monitoring, thereby reducing the likelihood of fatal outcomes.


Comparative Analysis of Deep Learning Models for Real-World ISP Network Traffic Forecasting

Koumar, Josef, Smoleň, Timotej, Jeřábek, Kamil, Čejka, Tomáš

arXiv.org Artificial Intelligence

Traffic monitoring is a cornerstone of effective network management and cybersecurity, providing Internet Service Providers (ISPs) with critical insights to detect anomalies, mitigate congestion, and maintain network performance [1]. The surge in video streaming, cloud computing, and online gaming is driving rapid growth in internet usage, contributing to increasingly complex and less predictable network traffic. Efficient network monitoring allows ISPs to maintain service quality, mitigate security risks, and optimize bandwidth in real time [2]. However, real-time monitoring alone is insufficient for proactively managing network resources. To anticipate variations in demand and prevent service disruptions, ISPs increasingly adopt advanced forecasting techniques to predict traffic patterns and optimize resource allocation in advance [3]. Accurate traffic forecasting allows ISPs to efficiently allocate resources, scale network capacity, and sustain service quality under fluctuating loads [3]. The rise of diverse, high-bandwidth services has significantly increased network traffic variability. Traditional models like ARIMA and exponential smoothing, which assume linearity, struggle with ISP data due to prevalent non-linear and high-frequency fluctuations, especially during peak traffic hours [4]. These limitations have driven the adoption of deep learning models, particularly neural networks, which excel at capturing complex temporal dependencies across various forecasting domains [5].


Zero-shot Medical Event Prediction Using a Generative Pre-trained Transformer on Electronic Health Records

Redekop, Ekaterina, Wang, Zichen, Kulkarni, Rushikesh, Pleasure, Mara, Chin, Aaron, Hassanzadeh, Hamid Reza, Hill, Brian L., Emami, Melika, Speier, William, Arnold, Corey W.

arXiv.org Artificial Intelligence

Longitudinal data in electronic health records (EHRs) represent an individual's clinical history through a sequence of codified concepts, including diagnoses, procedures, medications, and laboratory tests. Foundational models, such as generative pre-trained transformers (GPT), can leverage this data to predict future events. While fine-tuning of these models enhances task-specific performance, it is costly, complex, and unsustainable for every target. We show that a foundation model trained on EHRs can perform predictive tasks in a zero-shot manner, eliminating the need for fine-tuning. This study presents the first comprehensive analysis of zero-shot forecasting with GPT-based foundational models in EHRs, introducing a novel pipeline that formulates medical concept prediction as a generative modeling task. Unlike supervised approaches requiring extensive labeled data, our method enables the model to forecast the next medical event purely from pretraining knowledge. We evaluate performance across multiple time horizons and clinical categories, demonstrating the model's ability to capture latent temporal dependencies and complex patient trajectories without task supervision. Model performance for predicting the next medical concept was evaluated using precision and recall metrics, achieving an average top-1 precision of 0.614 and recall of 0.524. For 12 major diagnostic conditions, the model demonstrated strong zero-shot performance, achieving high true positive rates while maintaining low false positives. We demonstrate the power of a foundational EHR GPT model in capturing diverse phenotypes and enabling robust, zero-shot forecasting of clinical outcomes. This capability enhances the versatility of predictive healthcare models and reduces the need for task-specific training, enabling more scalable applications in clinical settings.
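The generative framing above, predicting the next concept code in a patient's sequence, can be made concrete with a deliberately simple baseline. The toy below is a hypothetical frequency-based bigram model over diagnosis codes (not the paper's GPT pipeline, and the ICD-10-style codes are illustrative only); it shows the same interface a generative EHR model exposes: given the history so far, emit the most likely next code.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count code-to-code transitions across patient histories.

    sequences: list of per-patient lists of concept codes, in visit order
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next_code(counts, last_code, top_k=1):
    """Return the top-k most frequent codes observed after `last_code`."""
    return [code for code, _ in counts[last_code].most_common(top_k)]

# Hypothetical histories: E11 (diabetes), I10 (hypertension),
# N17 (acute kidney failure), E87 (electrolyte disorder)
histories = [
    ["E11", "I10", "N17"],
    ["E11", "I10", "E87"],
    ["E11", "I10", "N17"],
]
counts = train_bigram(histories)
prediction = predict_next_code(counts, "I10")
```

A GPT-style foundation model replaces the bigram table with learned contextual representations over the full history, which is what allows the zero-shot transfer across time horizons and clinical categories described above.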