
Neural Bayes inference for complex bivariate extremal dependence models

arXiv.org Machine Learning

Likelihood-free approaches are appealing for performing inference on complex dependence models, either because it is not possible to formulate a likelihood function or because its evaluation is very computationally costly. This is the case for several models available in the multivariate extremes literature, particularly for the most flexible tail models, including those that interpolate between the two key dependence classes of 'asymptotic dependence' and 'asymptotic independence'. We focus on approaches that leverage neural networks to approximate Bayes estimators. In particular, we explore the properties of neural Bayes estimators for parameter inference in several models that are flexible but computationally expensive to fit, with a view to aiding their routine implementation. Owing to the absence of likelihood evaluation in the inference procedure, classical information criteria such as the Bayesian information criterion cannot be used to select the most appropriate model. Instead, we propose using neural networks as neural Bayes classifiers for model selection. Our goal is to provide a toolbox for simple, fast fitting and comparison of complex extreme-value dependence models, where the best model is selected for a given data set and its parameters are subsequently estimated using neural Bayes estimation. We apply our classifiers and estimators to analyse the pairwise extremal behaviour of changes in horizontal geomagnetic field fluctuations at three different locations.
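
As a rough illustration of the neural Bayes estimation recipe, the sketch below trains a permutation-invariant (DeepSets-style) network to map simulated replicates to a parameter estimate under an L1 loss, whose Bayes estimator is the posterior median. The toy Gaussian simulator, the uniform prior, and the network sizes are placeholders, not the extremal dependence models studied in the paper.

```python
# Minimal sketch of a neural Bayes estimator; simulator, prior, and
# network sizes are illustrative stand-ins, not the paper's models.
import torch
import torch.nn as nn

def sample_prior(n):
    # Hypothetical uniform prior on a single dependence parameter.
    return torch.rand(n, 1) * 0.9 + 0.05

def simulate(theta, m=100):
    # Placeholder simulator: m bivariate replicates whose correlation
    # is controlled by theta (stands in for an extremal dependence model).
    z = torch.randn(theta.shape[0], m, 2)
    z[..., 1] = theta * z[..., 0] + (1 - theta**2).sqrt() * z[..., 1]
    return z

# DeepSets-style estimator: permutation-invariant over replicates.
phi = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 64))
rho = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(phi.parameters()) + list(rho.parameters()), 1e-3)

for step in range(2000):
    theta = sample_prior(256)                # draw parameters from the prior
    x = simulate(theta)                      # simulate data given parameters
    est = rho(phi(x).mean(dim=1))            # pool over replicates, estimate
    loss = (est - theta).abs().mean()        # L1 loss -> posterior median
    opt.zero_grad(); loss.backward(); opt.step()
```

Once trained, the estimator amortises inference: applying it to a new data set costs a single forward pass, which is what makes routine fitting of otherwise expensive models feasible.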


Data-driven Seasonal Climate Predictions via Variational Inference and Transformers

arXiv.org Machine Learning

Most operational climate services providers base their seasonal predictions on initialised general circulation models (GCMs) or on statistical techniques fit to past observations. GCMs require substantial computational resources, which limits their capacity; statistical methods, in contrast, often lack robustness due to short historical records. Recent works propose machine learning methods trained on climate model output, leveraging larger sample sizes and simulated scenarios. Yet many of these studies focus on prediction tasks that may be restricted in spatial extent or temporal coverage, leaving a gap relative to existing operational predictions. The present study therefore evaluates the effectiveness of a methodology that combines variational inference with transformer models to predict fields of seasonal anomalies. The predictions cover all four seasons and are initialised one month before the start of each season. The model was trained on climate model output from CMIP6 and tested using ERA5 reanalysis data. We analyse the method's performance in predicting interannual anomalies beyond the climate change-induced trend. We also test the proposed methodology in a regional context with a use case focused on Europe. While climate change trends dominate the skill of temperature predictions, the method shows additional skill over the climatological forecast in regions influenced by known teleconnections. We reach similar conclusions based on the validation of precipitation predictions. Despite underperforming SEAS5 across most of the tropics, our model offers added value in numerous extratropical inland regions. This work demonstrates the effectiveness of training generative models on climate model output for seasonal predictions, providing skilful predictions beyond the induced climate change trend at time scales and lead times relevant for user applications.
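
A minimal sketch of the two general ingredients, variational inference wrapped around a transformer encoder, is given below. The patch shapes, the Gaussian latent, and the simple decoder are assumptions made for illustration; they do not reproduce the paper's architecture or its predictor fields.

```python
# Illustrative VAE-with-transformer sketch for seasonal anomaly fields;
# all shapes and layers are assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class SeasonalVAE(nn.Module):
    def __init__(self, n_tokens=64, d=128, z_dim=32):
        super().__init__()
        enc = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=4)
        self.to_mu = nn.Linear(d, z_dim)
        self.to_logvar = nn.Linear(d, z_dim)
        self.decoder = nn.Sequential(nn.Linear(z_dim, d), nn.ReLU(),
                                     nn.Linear(d, n_tokens * d))
        self.n_tokens, self.d = n_tokens, d

    def forward(self, x):                      # x: (batch, tokens, d) patches
        h = self.encoder(x).mean(dim=1)        # pooled predictor context
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam. trick
        recon = self.decoder(z).view(-1, self.n_tokens, self.d)
        kl = -0.5 * (1 + logvar - mu**2 - logvar.exp()).sum(-1).mean()
        return recon, kl

model = SeasonalVAE()
x = torch.randn(8, 64, 128)                    # stand-in anomaly-field patches
recon, kl = model(x)
loss = ((recon - x)**2).mean() + 1e-3 * kl     # ELBO-style training objective
```

Sampling the latent repeatedly at prediction time yields an ensemble of anomaly fields, which is how a generative model of this kind can serve as a probabilistic seasonal forecast.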


Nearest Neighbour Equilibrium Clustering

arXiv.org Machine Learning

A novel and intuitive nearest-neighbours-based clustering algorithm is introduced, in which a cluster is defined by an equilibrium condition that balances its size and cohesiveness. The formulation of the equilibrium condition allows the strength of each point's alignment to a cluster to be quantified, and these cluster alignment strengths lead naturally to a model selection criterion which renders the proposed approach fully automatable. The algorithm is simple to implement and computationally efficient, and produces clustering solutions of extremely high quality in comparison with relevant benchmarks from the literature. R code to implement the approach is available from https://github.com/DavidHofmeyr/

I. Introduction

Clustering, or cluster analysis, is the task of partitioning a set of data into groups, or clusters, which are more homogeneous than the data as a whole. Clustering is one of the fundamental data analytic tasks and forms an integral component of exploratory data analysis. It is also of growing relevance, as data are increasingly collected or generated by automated processes where very little prior knowledge is available, making exploratory methods a necessity. In the classical clustering problem there is no explicit information about how the data should be grouped, and various interpretations of how clusters of points may be defined have led to the development of a very large number of methods for identifying them. Almost universally, however, clusters are determined from the geometric properties of the data: pairs of points which are near one another are typically seen as likely to be in the same cluster, and pairs which are distant as more likely to be in different clusters.
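
The paper defines the equilibrium condition precisely; the toy Python sketch below only illustrates the general flavour of balancing cluster size against cohesiveness using nearest-neighbour information. The growth rule, the threshold alpha, and the synthetic two-blob data are all invented for illustration and do not reproduce the published algorithm, whose reference implementation is in R at the URL above.

```python
# Toy sketch: grow a cluster from a seed and stop when the nearest outside
# point is far relative to the cluster's internal nearest-neighbour scale.
# This is NOT the paper's equilibrium condition, only an analogue of it.
import numpy as np

def grow_cluster(X, seed, alpha=2.0):
    D = np.linalg.norm(X[:, None] - X[None], axis=-1)   # pairwise distances
    members = [seed]
    while len(members) < len(X):
        outside = [i for i in range(len(X)) if i not in members]
        d_out = D[np.ix_(outside, members)].min(axis=1)  # gaps to the cluster
        j = int(d_out.argmin())
        cand, d_cand = outside[j], d_out[j]
        if len(members) > 1:
            # Cohesiveness: mean nearest-neighbour distance inside the cluster.
            sub = D[np.ix_(members, members)] + 1e9 * np.eye(len(members))
            scale = sub.min(axis=1).mean()
            if d_cand > alpha * scale:   # size/cohesion balance reached: stop
                break
        members.append(cand)
    return members

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
print(sorted(grow_cluster(X, seed=0)))   # recovers (roughly) the first blob
```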


Nested Stochastic Gradient Descent for (Generalized) Sinkhorn Distance-Regularized Distributionally Robust Optimization

arXiv.org Machine Learning

Distributionally robust optimization (DRO) is a powerful technique for training models that are robust to data distribution shift. This paper aims to solve regularized nonconvex DRO problems in which the uncertainty set is modeled by a so-called generalized Sinkhorn distance and the loss function is nonconvex and possibly unbounded. Such a distance makes it possible to model uncertainty over distributions with different probability supports and divergence functions. For this class of regularized DRO problems, we derive a novel dual formulation taking the form of a nested stochastic program, in which the dual variable depends on the data sample. To solve the dual problem, we provide theoretical evidence supporting the design of a nested stochastic gradient descent (SGD) algorithm, which leverages stochastic approximation to estimate the nested stochastic gradients. We study the convergence rate of nested SGD and establish polynomial iteration and sample complexities that are independent of the data size and parameter dimension, indicating its potential for solving large-scale DRO problems. We conduct numerical experiments to demonstrate the efficiency and robustness of the proposed algorithm.
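
To make the nesting concrete, the sketch below performs nested SGD steps for an entropic-regularized dual of the assumed form min_x lam * E_xi[log E_{zeta|xi} exp(loss(x, zeta)/lam)], where the inner conditional expectation is itself estimated by Monte Carlo inside each outer step. The toy loss, the Gaussian perturbation model for zeta given xi, and all constants are assumptions; the paper's exact dual and sampling scheme differ.

```python
# Sketch of one nested SGD step for a Sinkhorn-type DRO dual (assumed form).
import torch

def loss(x, z):
    # Toy nonconvex per-sample loss (illustrative placeholder).
    u = z @ x
    return torch.sin(u) ** 2 + 0.1 * u ** 2

def nested_sgd_step(x, xi_batch, lam=1.0, sigma=0.1, inner=16, lr=1e-2):
    # Inner samples zeta | xi: Gaussian perturbations of each data point.
    zeta = xi_batch.unsqueeze(1) + sigma * torch.randn(
        xi_batch.shape[0], inner, xi_batch.shape[1])
    inner_vals = torch.exp(loss(x, zeta) / lam)           # (batch, inner)
    obj = lam * torch.log(inner_vals.mean(dim=1)).mean()  # outer average
    grad, = torch.autograd.grad(obj, x)
    return (x - lr * grad).detach().requires_grad_(True)

x = torch.randn(5, requires_grad=True)    # model parameters
xi = torch.randn(64, 5)                   # data mini-batch
for _ in range(100):
    x = nested_sgd_step(x, xi)
```

The point of the nesting is visible in the shapes: every outer sample xi carries its own inner mini-batch of zeta draws, matching a dual variable that depends on the data sample.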


Unveiling the Power of Uncertainty: A Journey into Bayesian Neural Networks for Stellar Dating

arXiv.org Machine Learning

Context: Astronomy and astrophysics demand rigorous handling of uncertainties to ensure the credibility of outcomes. The growing integration of artificial intelligence offers a novel avenue to address this necessity. This convergence presents an opportunity to create advanced models capable of quantifying diverse sources of uncertainty and automating complex data relationship exploration. Aims: We introduce a hierarchical Bayesian architecture whose probabilistic relationships are modeled by neural networks, designed to forecast stellar attributes such as mass, radius, and age (our main target). This architecture handles both observational uncertainties stemming from measurements and epistemic uncertainties inherent in the predictive model itself. As a result, our system generates distributions that encapsulate the potential range of values for our predictions, providing a comprehensive understanding of their variability and robustness. Methods: Our focus is on dating main sequence stars using a technique known as Chemical Clocks, which serves as both our primary astronomical challenge and a model prototype. In this work, we use hierarchical architectures to account for correlations between stellar parameters and optimize information extraction from our dataset. We also employ Bayesian neural networks for their versatility and flexibility in capturing complex data relationships. Results: By integrating our machine learning algorithm into a Bayesian framework, we propagate errors consistently and treat uncertainty effectively, resulting in predictions characterized by broader uncertainty margins. This approach facilitates more conservative estimates in stellar dating. Our architecture achieves age predictions with a mean absolute error of less than 1 Ga for the stars in the test dataset.
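
The paper's hierarchical Bayesian architecture is not reproduced here, but the following sketch shows one standard way to combine the two uncertainty sources it distinguishes: MC dropout for epistemic uncertainty plus a predicted-variance head for observational noise. The input features (stand-ins for quantities like Teff, [Fe/H], or chemical-clock abundance ratios) and all sizes are placeholders.

```python
# Minimal uncertainty-aware stellar-age regressor: MC dropout (epistemic)
# plus a heteroscedastic variance head (observational). Illustrative only.
import torch
import torch.nn as nn

class AgeNet(nn.Module):
    def __init__(self, n_in=6):                 # placeholder feature count
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_in, 64), nn.ReLU(),
                                  nn.Dropout(0.1),
                                  nn.Linear(64, 64), nn.ReLU(),
                                  nn.Dropout(0.1))
        self.mu = nn.Linear(64, 1)              # predicted age (Ga)
        self.log_var = nn.Linear(64, 1)         # predicted observational var.

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.log_var(h)

def predictive(model, x, samples=100):
    model.train()                               # keep dropout active
    mus, vs = zip(*(model(x) for _ in range(samples)))
    mu = torch.stack(mus)                       # (samples, n, 1)
    epistemic = mu.var(0)                       # spread across dropout masks
    aleatoric = torch.stack(vs).exp().mean(0)   # mean predicted noise var.
    return mu.mean(0), (epistemic + aleatoric).sqrt()

mean, std = predictive(AgeNet(), torch.randn(32, 6))
```

Training such a network with the Gaussian negative log-likelihood (rather than plain MSE) is what lets the variance head absorb measurement noise, leaving the dropout spread to reflect model uncertainty; this mirrors the broader, more conservative uncertainty margins the paper reports.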


A Comprehensive Benchmark for RNA 3D Structure-Function Modeling

arXiv.org Machine Learning

The RNA structure-function relationship has recently garnered significant attention within the deep learning community, promising to grow in importance as nucleic acid structure models advance. However, the absence of standardized and accessible benchmarks for deep learning on RNA 3D structures has impeded the development of models for RNA functional characteristics. In this work, we introduce a set of seven benchmarking datasets for RNA structure-function prediction, designed to address this gap. Our library builds on the established Python library rnaglib and offers easy data distribution and encoding, dataset splitters, and evaluation methods, providing a convenient all-in-one framework for comparing models. Datasets are implemented in a fully modular and reproducible manner, facilitating community contributions and customization. Finally, we provide initial baseline results for all tasks using a graph neural network. Source code: https://github.com/cgoliver/rnaglib Documentation: https://rnaglib.org
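
rnaglib's own loaders and task API are documented at the links above and are not reproduced here. As a generic illustration of the kind of graph neural network baseline reported, the following PyTorch Geometric sketch classifies a toy nucleotide graph; the node features, backbone edges, and label dimension are stand-ins.

```python
# Generic GNN baseline sketch (not rnaglib's API): a two-layer GCN with
# mean pooling for a graph-level RNA property.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class RNAGCN(torch.nn.Module):
    def __init__(self, in_dim=16, hidden=64, n_classes=2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, n_classes)

    def forward(self, x, edge_index, batch):
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        return self.head(global_mean_pool(h, batch))   # graph-level logits

# One toy RNA graph: 10 nucleotides, random features, backbone edges.
x = torch.randn(10, 16)
edge_index = torch.tensor([[i for i in range(9)],
                           [i + 1 for i in range(9)]])
batch = torch.zeros(10, dtype=torch.long)
logits = RNAGCN()(x, edge_index, batch)
```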


tempdisagg: A Python Framework for Temporal Disaggregation of Time Series Data

arXiv.org Machine Learning

tempdisagg is a modern, extensible, and production-ready Python framework for temporal disaggregation of time series data. It transforms low-frequency aggregates into consistent high-frequency estimates using a wide array of econometric techniques, including Chow-Lin, Denton, Litterman, Fernandez, and uniform interpolation, as well as enhanced variants with automated estimation of key parameters such as the autocorrelation coefficient rho. The package introduces features beyond the classical methods, including robust ensemble modeling via non-negative least squares optimization, post-estimation correction of negative values under multiple aggregation rules, and optional regression-based imputation of missing values through a dedicated Retropolarizer module. Architecturally, it follows a modular design inspired by scikit-learn, offering a clean API for validation, modeling, visualization, and result interpretation.
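
As a sketch of the underlying mathematics rather than the package's API, the NumPy snippet below solves the additive first-difference Denton problem: choose high-frequency values that track an indicator series as smoothly as possible while summing exactly to the low-frequency totals. The annual totals and quarterly indicator are made-up numbers.

```python
# Additive first-difference Denton disaggregation via its KKT system.
# Illustrates the technique tempdisagg implements; not the package's API.
import numpy as np

def denton(y, s, freq=4):
    """Disaggregate low-frequency totals y using high-frequency indicator s."""
    n = len(s)
    C = np.kron(np.eye(len(y)), np.ones(freq))      # aggregation: C x = y
    D = (np.eye(n) - np.eye(n, k=-1))[1:]           # first-difference matrix
    Q = 2 * D.T @ D
    # KKT system for: min ||D(x - s)||^2  subject to  C x = y
    K = np.block([[Q, C.T], [C, np.zeros((len(y), len(y)))]])
    rhs = np.concatenate([Q @ s, y])
    return np.linalg.solve(K, rhs)[:n]

y = np.array([100.0, 120.0])                        # two annual totals
s = np.array([20, 24, 26, 28, 27, 29, 31, 35.0])    # quarterly indicator
x = denton(y, s)
print(x, x[:4].sum(), x[4:].sum())                  # quarterly sums match y
```

Chow-Lin, Litterman, and Fernandez differ mainly in replacing the smoothness penalty with a regression model whose residual covariance (governed by parameters such as rho) is estimated from the data, which is the estimation the package automates.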


Feature-Enhanced Machine Learning for All-Cause Mortality Prediction in Healthcare Data

arXiv.org Machine Learning

Accurate patient mortality prediction enables effective risk stratification, leading to personalized treatment plans and improved patient outcomes. However, predicting mortality in healthcare remains a significant challenge, with existing studies often focusing on specific diseases or limited predictor sets. This study evaluates machine learning models for all-cause in-hospital mortality prediction using the MIMIC-III database, employing a comprehensive feature engineering approach. Guided by clinical expertise and literature, we extracted key features such as vital signs (e.g., heart rate, blood pressure), laboratory results (e.g., creatinine, glucose), and demographic information. The Random Forest model achieved the highest performance with an AUC of 0.94, significantly outperforming other machine learning and deep learning approaches. This demonstrates Random Forest's robustness in handling high-dimensional, noisy clinical data and its potential for developing effective clinical decision support tools. Our findings highlight the importance of careful feature engineering for accurate mortality prediction. We conclude by discussing implications for clinical adoption and propose future directions, including enhancing model robustness and tailoring prediction models for specific diseases.
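
A minimal sketch of the modeling step with scikit-learn is shown below. Since MIMIC-III requires credentialed access, the feature matrix and mortality label here are synthetic, and the column names are merely illustrative of the vital-sign and laboratory features described.

```python
# Random Forest mortality-prediction sketch on synthetic stand-in data.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "heart_rate": rng.normal(85, 15, 2000),
    "systolic_bp": rng.normal(120, 20, 2000),
    "creatinine": rng.lognormal(0, 0.4, 2000),
    "glucose": rng.normal(130, 40, 2000),
    "age": rng.integers(18, 90, 2000),
})
# Synthetic in-hospital mortality label loosely tied to the features.
risk = 0.02 * X["age"] + 0.5 * X["creatinine"] - 0.01 * X["systolic_bp"]
y = (risk + rng.normal(0, 1, 2000) > risk.mean()).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```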


Explainable Boosting Machine for Predicting Claim Severity and Frequency in Car Insurance

arXiv.org Machine Learning

In a context of constantly increasing competition and heightened regulatory pressure, accuracy and actuarial precision, as well as transparency and understanding of the tariff, are key issues in non-life insurance. The generalized linear models (GLMs) traditionally used result in a multiplicative tariff that favors interpretability. With the rapid development of machine learning and deep learning techniques, actuaries and the rest of the insurance industry have adopted these techniques widely; however, they need to be paired with interpretability techniques. In this paper, we introduce an Explainable Boosting Machine (EBM) model that combines intrinsically interpretable characteristics with high prediction performance. This approach is described as a glass-box model and relies on a Generalized Additive Model (GAM) and a cyclic gradient boosting algorithm. It accounts for univariate and pairwise interaction effects between features and naturally provides explanations for them. We implement this approach on car insurance frequency and severity data and extensively compare its performance with classical competitors: a GLM, a GAM, a CART model, and an Extreme Gradient Boosting (XGB) algorithm. Finally, we examine the interpretability of these models to capture the main determinants of claim costs.
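
EBMs are available in the open-source interpret package; the snippet below fits one to simulated severity data with pairwise interactions enabled. The portfolio variables and the severity-generating process are invented for illustration, and a real tariff study would model frequency and severity separately with appropriate distributions.

```python
# EBM sketch with the interpret package on simulated claim-severity data.
import numpy as np
import pandas as pd
from interpret.glassbox import ExplainableBoostingRegressor

rng = np.random.default_rng(1)
n = 5000
X = pd.DataFrame({
    "driver_age": rng.integers(18, 85, n),
    "vehicle_power": rng.integers(4, 15, n),
    "bonus_malus": rng.uniform(50, 150, n),
})
# Synthetic severity with a nonlinear (U-shaped) age effect.
sev = (800 + 30 * (X["driver_age"] - 45) ** 2 / 45
       + 50 * X["vehicle_power"] + rng.gamma(2, 200, n))

ebm = ExplainableBoostingRegressor(interactions=5)   # allow pairwise terms
ebm.fit(X, sev)
pred = ebm.predict(X.head())
# ebm.explain_global() exposes each term's shape function, which is what
# makes the fitted tariff directly inspectable, term by term.
```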


Simulation-informed deep learning for enhanced SWOT observations of fine-scale ocean dynamics

arXiv.org Machine Learning

Oceanic processes at fine scales are crucial yet difficult to observe accurately due to limitations in satellite and in-situ measurements. The Surface Water and Ocean Topography (SWOT) mission provides high-resolution Sea Surface Height (SSH) data, though noise patterns often obscure fine-scale structures. Current methods struggle with noisy data or require extensive supervised training, limiting their effectiveness on real-world observations. We introduce SIMPGEN (Simulation-Informed Metric and Prior for Generative Ensemble Networks), an unsupervised adversarial learning framework combining real SWOT observations with simulated reference data. SIMPGEN leverages wavelet-informed neural metrics to distinguish noisy from clean fields, guiding realistic SSH reconstructions. Applied to SWOT data, SIMPGEN effectively removes noise, preserving fine-scale features better than existing neural methods. This robust, unsupervised approach not only improves the interpretation of SWOT SSH data but also shows strong potential for broader oceanographic applications, including data assimilation and super-resolution.
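
The sketch below is a heavily simplified rendering of that adversarial setup: a convolutional denoiser trained against a critic that has only seen simulated clean fields, with an observation-fidelity penalty keeping the output anchored to the input. The wavelet-informed metrics and SIMPGEN's actual architecture are omitted, and all fields are synthetic stand-ins.

```python
# Simplified simulation-informed adversarial denoising sketch (not SIMPGEN).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 1, 3, padding=1))           # denoiser
C = nn.Sequential(nn.Conv2d(1, 32, 3, stride=2), nn.ReLU(),
                  nn.Flatten(), nn.LazyLinear(1))           # critic
opt_g = torch.optim.Adam(G.parameters(), 1e-4)
opt_c = torch.optim.Adam(C.parameters(), 1e-4)

for step in range(200):
    clean = torch.randn(16, 1, 32, 32)             # simulated reference SSH
    noisy = clean + 0.5 * torch.randn_like(clean)  # SWOT-like noise stand-in

    # Critic step: score simulated clean fields high, denoised outputs low.
    c_loss = -(C(clean).mean() - C(G(noisy).detach()).mean())
    opt_c.zero_grad(); c_loss.backward(); opt_c.step()

    # Generator step: fool the critic while staying close to the observation.
    den = G(noisy)
    g_loss = -C(den).mean() + 0.1 * (den - noisy).pow(2).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Because the critic only ever needs simulated clean fields and real noisy ones, no paired clean/noisy observations are required, which is what makes this style of training unsupervised.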