 underestimation




Reliable Real-Time Value at Risk Estimation via Quantile Regression Forest with Conformal Calibration

Wang, Du-Yi, Liang, Guo, Zhang, Kun, Zhu, Qianwen

arXiv.org Machine Learning

Rapidly evolving market conditions call for real-time risk monitoring, but its online estimation remains challenging. In this paper, we study the online estimation of one of the most widely used risk measures, Value at Risk (VaR). Its accurate and reliable estimation is essential for timely risk control and informed decision-making. We propose to use the quantile regression forest in the offline-simulation-online-estimation (OSOA) framework. Specifically, the quantile regression forest is trained offline to learn the relationship between the online VaR and risk factors, and real-time VaR estimates are then produced online by incorporating observed risk factors. To further ensure reliability, we develop a conformalized estimator that calibrates the online VaR estimates. To the best of our knowledge, we are the first to leverage conformal calibration to estimate real-time VaR reliably based on the OSOA formulation. Theoretical analysis establishes the consistency and coverage validity of the proposed estimators. Numerical experiments confirm the proposed method and demonstrate its effectiveness in practice.
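The calibration step described above can be illustrated with a generic one-sided split-conformal adjustment. This is a minimal sketch, not the authors' estimator: it assumes a held-out calibration set of base quantile estimates (from any model, such as a quantile regression forest) paired with realized outcomes, and the function names `conformal_offset` and `calibrated_var` are hypothetical.

```python
import math


def conformal_offset(base_quantiles, outcomes, alpha):
    """Return the additive correction for a one-sided quantile estimate.

    base_quantiles: base model's alpha-quantile estimates on a calibration set
    outcomes: realized values paired with those estimates
    alpha: target coverage level, e.g. 0.95 (assumed to lie in (0, 1))
    """
    # Conformity score: how far each outcome lands above the base estimate.
    scores = sorted(y - q for q, y in zip(base_quantiles, outcomes))
    n = len(scores)
    # Finite-sample rank used in split conformal prediction.
    k = math.ceil((n + 1) * alpha)
    if k > n:
        # Too few calibration points to certify this level.
        return math.inf
    return scores[k - 1]


def calibrated_var(base_quantile, offset):
    """Shift a new base estimate by the calibration offset."""
    return base_quantile + offset


# Toy calibration data (illustrative values only).
offset = conformal_offset([1.0, 1.0, 1.0, 1.0], [0.5, 1.5, 2.0, 3.0], alpha=0.5)
print(calibrated_var(1.0, offset))
```

Under exchangeability of calibration and test points, shifting by this offset gives coverage of at least `alpha` for the upper bound; the paper's own procedure and guarantees may differ in detail.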




Mitigating Estimation Bias with Representation Learning in TD Error-Driven Regularization

Chen, Haohui, Chen, Zhiyong, Liu, Aoxiang, Fang, Wentuo

arXiv.org Artificial Intelligence

Deterministic policy gradient algorithms for continuous control suffer from value estimation biases that degrade performance. While double critics reduce such biases, the exploration potential of double actors remains underexplored. Building on temporal-difference error-driven regularization (TDDR), a double actor-critic framework, this work introduces enhanced methods to achieve flexible bias control and stronger representation learning. We propose three convex combination strategies, symmetric and asymmetric, that balance pessimistic estimates to mitigate overestimation and optimistic exploration via double actors to alleviate underestimation. A single hyperparameter governs this mechanism, enabling tunable control across the bias spectrum. To further improve performance, we integrate augmented state and action representations into the actor and critic networks. Extensive experiments show that our approach consistently outperforms benchmarks, demonstrating the value of tunable bias and revealing that both overestimation and underestimation can be exploited differently depending on the environment.
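The idea of a single hyperparameter moving between pessimistic and optimistic value estimates can be sketched as a convex combination of the minimum and maximum of two critic outputs. The abstract does not spell out the three strategies, so the function `blended_target` and the weight name `beta` below are assumptions for illustration only.

```python
def blended_target(q_a, q_b, beta):
    """Convex combination of a pessimistic and an optimistic value estimate.

    q_a, q_b: two value estimates for the same state-action pair
    beta: weight in [0, 1]; 1.0 is fully pessimistic, 0.0 fully optimistic
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    pessimistic = min(q_a, q_b)  # guards against overestimation
    optimistic = max(q_a, q_b)   # guards against underestimation
    return beta * pessimistic + (1.0 - beta) * optimistic


# Sweeping beta moves the target across the bias spectrum.
for beta in (0.0, 0.5, 1.0):
    print(beta, blended_target(1.0, 3.0, beta))
```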


Hurdle-IMDL: An Imbalanced Learning Framework for Infrared Rainfall Retrieval

Zhang, Fangjian, Zhuge, Xiaoyong, Wang, Wenlan, Xiao, Haixia, Zhu, Yuying, Cheng, Siyang

arXiv.org Artificial Intelligence

Artificial intelligence has advanced quantitative remote sensing, yet its effectiveness is constrained by imbalanced label distribution. This imbalance leads conventionally trained models to favor common samples, which in turn degrades retrieval performance for rare ones. Rainfall retrieval exemplifies this issue, with performance particularly compromised for heavy rain. This study proposes the Hurdle-Inversion Model Debiasing Learning (IMDL) framework. Following a divide-and-conquer strategy, imbalance in the rain distribution is decomposed into two components: zero inflation, defined by the predominance of non-rain samples; and long tail, defined by the disproportionate abundance of light-rain samples relative to heavy-rain samples. A hurdle model is adopted to handle the zero inflation, while IMDL is proposed to address the long tail by transforming the learning object into an unbiased ideal inverse model. Comprehensive evaluation via statistical metrics and case studies investigating rainy weather in eastern China confirms Hurdle-IMDL's superiority over conventional, cost-sensitive, generative, and multi-task learning methods. Its key advancements include effective mitigation of systematic underestimation and a marked improvement in the retrieval of heavy-to-extreme rain. IMDL offers a generalizable approach for addressing imbalance in distributions of environmental variables, enabling enhanced retrieval of rare yet high-impact events.
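A hurdle model, in its generic form, splits prediction into two stages: first decide whether the target is nonzero, then estimate its magnitude only for the nonzero cases. The sketch below shows that two-stage logic under stated assumptions; it does not reproduce the IMDL component, and the names `hurdle_predict`, `hurdle_expectation`, and the default `threshold` are hypothetical.

```python
def hurdle_predict(p_rain, amount_if_rain, threshold=0.5):
    """Hard two-stage prediction.

    p_rain: probability of rain from an occurrence classifier (stage 1)
    amount_if_rain: rain amount from a regressor fit on rainy samples (stage 2)
    threshold: assumed decision cutoff for declaring rain
    """
    if p_rain < threshold:
        return 0.0  # the hurdle is not crossed, so predict no rain
    return amount_if_rain


def hurdle_expectation(p_rain, amount_if_rain):
    """Soft alternative: expected amount, P(rain) * E[amount | rain]."""
    return p_rain * amount_if_rain


print(hurdle_predict(0.2, 10.0))
print(hurdle_predict(0.8, 10.0))
print(hurdle_expectation(0.25, 8.0))
```

Separating occurrence from amount keeps the many zero-rain samples from dominating the regressor, which is the zero-inflation issue the abstract describes.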




MAUSAM: An Observations-focused assessment of Global AI Weather Prediction Models During the South Asian Monsoon

Gupta, Aman, Sheshadri, Aditi, Suri, Dhruv

arXiv.org Artificial Intelligence

Accurate weather forecasts are critical for societal planning and disaster preparedness. Yet these forecasts remain challenging to produce and evaluate, especially in regions with sparse observational coverage. Current evaluation of artificial intelligence (AI) weather prediction relies primarily on reanalyses, which can obscure important deficiencies. Here we present MAUSAM (Measuring AI Uncertainty during South Asian Monsoon), an evaluation of seven leading AI-based forecasting systems - FourCastNet, FourCastNet-SFNO, Pangu-Weather, GraphCast, Aurora, AIFS, and GenCast - during the South Asian Monsoon, using ground-based weather stations, rain gauge networks, and geostationary satellite imagery. The AI models demonstrate impressive forecast skill during the monsoon across a broad range of variables, ranging from large-scale surface temperature and winds to precipitation, cloud cover, and subseasonal to seasonal eddy statistics, highlighting the strength of data-driven weather prediction. However, the models still exhibit systematic errors at finer scales, such as underprediction of extreme precipitation, divergent cyclone tracks, and misrepresented mesoscale kinetic energy spectra, highlighting avenues for future improvement. A comparison against observations reveals forecast errors up to 15-45% larger than those relative to reanalysis and traditional forecasts, indicating that reanalysis-centric benchmarks can overstate forecast skill. Of the models assessed, AIFS achieves the most consistent representation of atmospheric variables, with GraphCast and GenCast also showing strong skill. The analysis presents a framework for evaluating AI weather models on regional prediction and highlights both the promise and current limitations of AI weather prediction in data-sparse regions, underscoring the importance of observational evaluation for future operational adoption.