Goto

Collaborating Authors

 model forecast


Why Did This Model Forecast This Future? Information-Theoretic Saliency for Counterfactual Explanations of Probabilistic Regression Models

Neural Information Processing Systems

We propose a post hoc saliency-based explanation framework for counterfactual reasoning in probabilistic multivariate time-series forecasting (regression) settings. Building upon Miller's framework of explanations derived from research in multiple social science disciplines, we establish a conceptual link between counterfactual reasoning and saliency-based explanation techniques. To address the lack of a principled notion of saliency, we leverage a unifying definition of information-theoretic saliency grounded in preattentive human visual cognition and extend it to forecasting settings. Specifically, we obtain a closed-form expression for commonly used density functions to identify which observed timesteps appear salient to an underlying model in making its probabilistic forecasts. We empirically validate our framework in a principled manner using synthetic data to establish ground-truth saliency that is unavailable for real-world data. Finally, using real-world data and forecasting models, we demonstrate how our framework can assist domain experts in forming new data-driven hypotheses about the causal relationships between features in the wild.


Investigating Compositional Reasoning in Time Series Foundation Models

Potosnak, Willa, Challu, Cristian, Goswami, Mononito, Olivares, Kin G., Wiliński, Michał, Żukowska, Nina, Dubrawski, Artur

arXiv.org Artificial Intelligence

Large pre-trained time series foundation models (TSFMs) have demonstrated promising zero-shot performance across a wide range of domains. However, a question remains: Do TSFMs succeed solely by memorizing training patterns, or do they possess the ability to reason? While reasoning is a topic of great interest in the study of Large Language Models (LLMs), it is undefined and largely unexplored in the context of TSFMs. In this work, inspired by language modeling literature, we formally define compositional reasoning in forecasting and distinguish it from in-distribution generalization. We evaluate the reasoning and generalization capabilities of 23 popular deep learning forecasting models on multiple synthetic and real-world datasets. Additionally, through controlled studies, we systematically examine which design choices in TSFMs contribute to improved reasoning abilities. Our study yields key insights into the impact of TSFM architecture design on compositional reasoning and generalization. We find that patch-based Transformers have the best reasoning performance, closely followed by residualized MLP-based architectures, which are 97\% less computationally complex in terms of FLOPs and 86\% smaller in terms of the number of trainable parameters. Interestingly, in some zero-shot out-of-distribution scenarios, these models can outperform moving average and exponential smoothing statistical baselines trained on in-distribution data. Only a few design choices, such as the tokenization method, had a significant (negative) impact on Transformer model performance.


Why Did This Model Forecast This Future? Information-Theoretic Saliency for Counterfactual Explanations of Probabilistic Regression Models

Neural Information Processing Systems

We propose a post hoc saliency-based explanation framework for counterfactual reasoning in probabilistic multivariate time-series forecasting (regression) settings. Building upon Miller's framework of explanations derived from research in multiple social science disciplines, we establish a conceptual link between counterfactual reasoning and saliency-based explanation techniques. To address the lack of a principled notion of saliency, we leverage a unifying definition of information-theoretic saliency grounded in preattentive human visual cognition and extend it to forecasting settings. Specifically, we obtain a closed-form expression for commonly used density functions to identify which observed timesteps appear salient to an underlying model in making its probabilistic forecasts. We empirically validate our framework in a principled manner using synthetic data to establish ground-truth saliency that is unavailable for real-world data.


Hybrid Forecasting of Geopolitical Events

Benjamin, Daniel M., Morstatter, Fred, Abbas, Ali E., Abeliuk, Andres, Atanasov, Pavel, Bennett, Stephen, Beger, Andreas, Birari, Saurabh, Budescu, David V., Catasta, Michele, Ferrara, Emilio, Haravitch, Lucas, Himmelstein, Mark, Hossain, KSM Tozammel, Huang, Yuzhong, Jin, Woojeong, Joseph, Regina, Leskovec, Jure, Matsui, Akira, Mirtaheri, Mehrnoosh, Ren, Xiang, Satyukov, Gleb, Sethi, Rajiv, Singh, Amandeep, Sosic, Rok, Steyvers, Mark, Szekely, Pedro A, Ward, Michael D., Galstyan, Aram

arXiv.org Artificial Intelligence

Sound decision-making relies on accurate prediction for tangible outcomes ranging from military conflict to disease outbreaks. To improve crowdsourced forecasting accuracy, we developed SAGE, a hybrid forecasting system that combines human and machine generated forecasts. The system provides a platform where users can interact with machine models and thus anchor their judgments on an objective benchmark. The system also aggregates human and machine forecasts weighting both for propinquity and based on assessed skill while adjusting for overconfidence. We present results from the Hybrid Forecasting Competition (HFC) - larger than comparable forecasting tournaments - including 1085 users forecasting 398 real-world forecasting problems over eight months. Our main result is that the hybrid system generated more accurate forecasts compared to a human-only baseline which had no machine generated predictions. We found that skilled forecasters who had access to machine-generated forecasts outperformed those who only viewed historical data. We also demonstrated the inclusion of machine-generated forecasts in our aggregation algorithms improved performance, both in terms of accuracy and scalability. This suggests that hybrid forecasting systems, which potentially require fewer human resources, can be a viable approach for maintaining a competitive level of accuracy over a larger number of forecasting questions.


Short-term precipitation prediction using deep learning

Chen, Guoxing, Wang, Wei-Chyung

arXiv.org Artificial Intelligence

Accurate weather prediction is essential for many aspects of life, notably the early warning of extreme weather events such as rainstorms. Short-term predictions of these events rely on forecasts from numerical weather models, in which, despite much improvement in the past decades, outstanding issues remain concerning model uncertainties, and increasing demands for computation and storage resources. In recent years, the advance of deep learning offers a viable alternative approach. Here, we show that a 3D convolutional neural network using a single frame of meteorology fields as input is capable of predicting the precipitation spatial distribution. The network is developed based on 39-years (1980-2018) data of meteorology and daily precipitation over the contiguous United States. The results bring fundamental advancements in weather prediction. First, the trained network alone outperforms the state-of-the-art weather models in predicting daily total precipitation, and the superiority of the network extends to forecast leads up to 5 days. Second, combining the network predictions with the weather-model forecasts significantly improves the accuracy of model forecasts, especially for heavy-precipitation events. Third, the millisecond-scale inference time of the network facilitates large ensemble predictions for further accuracy improvement. These findings strongly support the use of deep-learning in short-term weather predictions.


Does Machine Learning reconstruct missing sunspots and forecast a new solar minimum?

#artificialintelligence

The retrodiction and prediction of solar activity are two closely-related problems in dynamo theory. We applied Machine Learning (ML) algorithms and analyses to the World Data Center’s newly constructed annual sunspot time series (1700–2019; Version 2.0). This provides a unique model that gives insights into the various patterns of the Sun’s magnetic dynamo that drives solar activity maxima and minima. We found that the variability in the ~11-year Sunspot Cycle is closely connected with 120-year oscillatory magnetic activity variations. We also identified a previously under-reported 5.5 year periodicity in the sunspot record.