Neural Negative Binomial Regression for Weekly Seismicity Forecasting: Per-Cell Dispersion Estimation and Tail Risk Assessment

Igilik, Alim

arXiv.org Machine Learning 

Earthquake forecasting is a critical task for natural risk management, infrastructure resilience planning, and emergency response operations. For Central Asia, and the Tian Shan mountain system in particular, the problem carries heightened importance because of high tectonic activity, complex geodynamics, and pronounced spatiotemporal heterogeneity of seismic processes. In the applied setting, the goal is not a deterministic forecast of individual events but a macroscopic forecast of seismicity intensity: estimating the expected number of earthquakes with magnitude M ≥ 3.0 on a spatial grid at a weekly horizon. Historically, count data forecasting in fixed spatiotemporal cells has been formulated within the Poisson framework. However, its key assumption, equality of the conditional mean and the conditional variance, is systematically violated in real seismological data. Earthquakes exhibit pronounced clustering associated with swarm activity, foreshock-aftershock sequences, and episodes of anomalous activity, which produces overdispersion: the variance substantially exceeds the mean. Under these conditions, uncritical application of the Poisson distribution leads to biased uncertainty estimates and, consequently, to underestimation of the risk of extreme scenarios. Despite the widespread adoption of machine learning methods in seismological problems, a substantial portion of existing work remains methodologically vulnerable. On the one hand, several approaches apply continuous regression loss functions and metrics (e.g., MSE), ignoring the