AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Revealing Geography-Driven Signals in Zone-Level Claim Frequency Models: An Empirical Study using Environmental and Visual Predictors

Alfonso-Sánchez, Sherly, Bravo, Cristián, Stankova, Kristina G.

arXiv.org Machine LearningApr-24-2026

Geographic context is often consider relevant to motor insurance risk, yet public actuarial datasets provide limited location identifiers, constraining how this information can be incorporated and evaluated in claim-frequency models. This study examines how geographic information from alternative data sources can be incorporated into actuarial models for Motor Third Party Liability (MTPL) claim prediction under such constraints. Using the BeMTPL97 dataset, we adopt a zone-level modeling framework and evaluate predictive performance on unseen postcodes. Geographic information is introduced through two channels: environmental indicators from OpenStreetMap and CORINE Land Cover, and orthoimagery released by the Belgian National Geographic Institute for academic use. We evaluate the predictive contribution of coordinates, environmental features, and image embeddings across three baseline models: generalized linear models (GLMs), regularized GLMs, and gradient-boosted trees, while raw imagery is modeled using convolutional neural networks. Our results show that augmenting actuarial variables with constructed geographic information improves accuracy. Across experiments, both linear and tree-based models benefit most from combining coordinates with environmental features extracted at 5 km scale, while smaller neighborhoods also improve baseline specifications. Generally, image embeddings do not improve performance when environmental features are available; however, when such features are absent, pretrained vision-transformer embeddings enhance accuracy and stability for regularized GLMs. Our results show that the predictive value of geographic information in zone-level MTPL frequency models depends less on model complexity than on how geography is represented, and illustrate that geographic context can be incorporated despite limited individual-level spatial information.

artificial intelligence, machine learning, postcode 2, (18 more...)

arXiv.org Machine Learning

2604.21893

Country:

South America > Colombia (0.04)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
Asia > Bangladesh (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Banking & Finance > Insurance (1.00)
Transportation > Ground > Road (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)

Add feedback

Calibeating Prediction-Powered Inference

van der Laan, Lars, Van Der Laan, Mark

arXiv.org Machine LearningApr-24-2026

We study semisupervised mean estimation with a small labeled sample, a large unlabeled sample, and a black-box prediction model whose output may be miscalibrated. A standard approach in this setting is augmented inverse-probability weighting (AIPW) [Robins et al., 1994], which protects against prediction-model misspecification but can be inefficient when the prediction score is poorly aligned with the outcome scale. We introduce Calibrated Prediction-Powered Inference, which post-hoc calibrates the prediction score on the labeled sample before using it for semisupervised estimation. This simple step requires no retraining and can improve the original score both as a predictor of the outcome and as a regression adjustment for semisupervised inference. We study both linear and isotonic calibration. For isotonic calibration, we establish first-order optimality guarantees: isotonic post-processing can improve predictive accuracy and estimator efficiency relative to the original score and simpler post-processing rules, while no further post-processing of the fitted isotonic score yields additional first-order gains. For linear calibration, we show first-order equivalence to PPI++. We also clarify the relationship among existing estimators, showing that the original PPI estimator is a special case of AIPW and can be inefficient when the prediction model is accurate, while PPI++ is AIPW with empirical efficiency maximization [Rubin et al., 2008]. In simulations and real-data experiments, our calibrated estimators often outperform PPI and are competitive with, or outperform, AIPW and PPI++. We provide an accompanying Python package, ppi_aipw, at https://larsvanderlaan.github.io/ppi-aipw/.

artificial intelligence, machine learning, modeling & simulation, (17 more...)

arXiv.org Machine Learning

2604.2126

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
Health & Medicine > Health Care Technology > Medical Record (0.45)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.45)
Health & Medicine > Therapeutic Area > Oncology (0.45)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Bayesian Dyadic Trees and Histograms for Regression

Stéphanie van der Pas, Veronika Rockova

Neural Information Processing SystemsApr-23-2026, 22:22:39 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, partition, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Hunt For The Unique, Stable, Sparse And Fast Feature Learning On Graphs

Saurabh Verma, Zhi-Li Zhang

Neural Information Processing SystemsApr-23-2026, 21:04:58 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Spectrally-normalized margin bounds for neural networks

Neural Information Processing SystemsApr-23-2026, 17:48:28 GMT

This paper presents a margin-based multiclass generalization bound for neural networks that scales with their margin-normalized spectral complexity: their Lipschitz constant, meaning the product of the spectral norms of the weight matrices, times a certain correction factor. This bound is empirically investigated for a standard AlexNet network trained with SGD on the mnistand cifar10datasets, with both original and random labels; the bound, the Lipschitz constants, and the excess risks are all in direct correlation, suggesting both that SGD selects predictors whose complexity scales with the difficulty of the learning task, and secondly that the presented bound is sensitive to this complexity.

artificial intelligence, machine learning, margin distribution, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions

Aryeh Kontorovich, Sivan Sabato, Roi Weiss

Neural Information Processing SystemsApr-23-2026, 13:59:33 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.46)

Add feedback

An Empirical Study on The Properties of Random Bases for Kernel Methods

Neural Information Processing SystemsApr-23-2026, 13:59:15 GMT

Kernel machines as well as neural networks possess universal function approximation properties. Nevertheless in practice their ways of choosing the appropriate function class differ. Specifically neural networks learn a representation by adapting their basis functions to the data and the task at hand, while kernel methods typically use a basis that is not adapted during training. In this work, we contrast random features of approximated kernel machines with learned features of neural networks. Our analysis reveals how these random and adaptive basis functions affect the quality of learning. Furthermore, we present basis adaptation schemes that allow for a more compact representation, while retaining the generalization properties of kernel machines.

artificial intelligence, machine learning, neural network, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Conic Scan-and-Cover algorithms for nonparametric topic modeling

Mikhail Yurochkin, Aritra Guha, XuanLong Nguyen

Neural Information Processing SystemsApr-23-2026, 13:58:39 GMT

We propose new algorithms for topic modeling when the number of topics is unknown. Our approach relies on an analysis of the concentration of mass and angular geometry of the topic simplex, a convex polytope constructed by taking the convex hull of vertices representing the latent topics. Our algorithms are shown in practice to have accuracy comparable to a Gibbs sampler in terms of topic estimation, which requires the number of topics be given. Moreover, they are one of the fastest among several state of the art parametric techniques.1 Statistical consistency of our estimator is established under some conditions.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Training Deep Networks without Learning Rates Through Coin Betting

Neural Information Processing SystemsApr-23-2026, 10:30:37 GMT

Deep learning methods achieve state-of-the-art performance in many application scenarios. Yet, these methods require a significant amount of hyperparameters tuning in order to achieve the best results. In particular, tuning the learning rates in the stochastic optimization process is still one of the main bottlenecks. In this paper, we propose a new stochastic gradient descent procedure for deep networks that does not require any learning rate setting. Contrary to previous methods, we do not adapt the learning rates nor we make use of the assumed curvature of the objective function. Instead, we reduce the optimization process to a game of betting on a coin and propose a learning rate free optimal algorithm for this scenario. Theoretical convergence is proven for convex and quasi-convex functions and empirical evidence shows the advantage of our algorithm over popular stochastic gradient algorithms.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: