fitted model
fastml: Guarded Resampling Workflows for Safer Automated Machine Learning in R
Korkmaz, Selcuk, Goksuluk, Dincer, Karaismailoglu, Eda
Preprocessing leakage arises when scaling, imputation, or other data-dependent transformations are estimated before resampling, inflating apparent performance while remaining hard to detect. We present fastml, an R package that provides a single-call interface for leakage-aware machine learning through guarded resampling, where preprocessing is re-estimated inside each resample and applied to the corresponding assessment data. The package supports grouped and time-ordered resampling, blocks high-risk configurations, audits recipes for external dependencies, and includes sandboxed execution and integrated model explanation. We evaluate fastml with a Monte Carlo simulation contrasting global and fold-local normalization, a usability comparison with tidymodels under matched specifications, and survival benchmarks across datasets of different sizes. The simulation demonstrates that global preprocessing substantially inflates apparent performance relative to guarded resampling. fastml matched held-out performance obtained with tidymodels while reducing workflow orchestration, and it supported consistent benchmarking of multiple survival model classes through a unified interface.
- Europe > Netherlands > South Holland > Rotterdam (0.04)
- North America > United States > Wisconsin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
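The contrast between global and fold-local normalization described in the fastml abstract above can be illustrated with a minimal sketch. This is not fastml's API (fastml is an R package): the function names `standardize_params` and `guarded_folds` and the interleaved fold assignment are assumptions made for illustration only. The point is that the mean and standard deviation are re-estimated from each fold's analysis data and then applied to that fold's assessment data.

```python
import statistics

def standardize_params(values):
    """Return (mean, std) estimated from the given values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against zero variance
    return mean, std

def guarded_folds(x, k):
    """Yield (analysis_scaled, assessment_scaled) for each of k folds.

    Hypothetical helper: folds are assigned by index modulo k, and the
    scaling parameters come from the analysis portion of each fold only.
    """
    n = len(x)
    for fold in range(k):
        assess_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Estimate the transformation without looking at assessment rows.
        mean, std = standardize_params([x[i] for i in train_idx])
        yield (
            [(x[i] - mean) / std for i in train_idx],
            [(x[i] - mean) / std for i in assess_idx],
        )
```

A leaky variant would call `standardize_params(x)` once on the full data before splitting, so every assessment row would already have influenced the scaling applied to it.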
Predictive inference is free with the jackknife+-after-bootstrap
Ensemble learning is widely used in applications to make predictions in complex decision problems---for example, averaging models fitted to a sequence of samples bootstrapped from the available training data. While such methods offer more accurate, stable, and robust predictions and model estimates, much less is known about how to perform valid, assumption-lean inference on the output of these types of procedures. In this paper, we propose the jackknife+-after-bootstrap (J+aB), a procedure for constructing a predictive interval, which uses only the available bootstrapped samples and their corresponding fitted models, and is therefore free in terms of the cost of model fitting. The J+aB offers a predictive coverage guarantee that holds with no assumptions on the distribution of the data, the nature of the fitted model, or the way in which the ensemble of models are aggregated---at worst, the failure rate of the predictive interval is inflated by a factor of 2. Our numerical experiments verify the coverage and accuracy of the resulting predictive intervals on real data.
- Asia > Middle East > Jordan (0.05)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California (0.04)
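The idea in the jackknife+-after-bootstrap abstract above, reusing already-fitted bootstrap models to form leave-one-out style predictions, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the base learner (`fit_line`, a one-dimensional least-squares line), the mean aggregation, the fixed number of bootstrap samples, and the choice to skip points with no out-of-bag model are all assumptions, so the sketch should not be read as carrying the paper's coverage guarantee.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares line (slope, intercept); constant fit if x has no spread."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx

def jab_interval(xs, ys, x_new, alpha=0.1, n_boot=50, seed=0):
    """Sketch of a jackknife+-after-bootstrap style predictive interval."""
    rng = random.Random(seed)
    n = len(xs)
    # Fit each bootstrap model once; no further model fitting happens below.
    boots = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        boots.append((set(idx), model))

    lower, upper = [], []
    for i in range(n):
        # Aggregate (here: average) only the models that never saw point i.
        out = [m for seen, m in boots if i not in seen]
        if not out:
            continue  # simplification: skip points with no out-of-bag model

        def predict(x, models=out):
            return sum(a * x + b for a, b in models) / len(models)

        resid = abs(ys[i] - predict(xs[i]))
        lower.append(predict(x_new) - resid)
        upper.append(predict(x_new) + resid)

    lower.sort()
    upper.sort()
    m = len(lower)
    lo_rank = math.floor(alpha * (m + 1))       # 1-based rank
    hi_rank = math.ceil((1 - alpha) * (m + 1))  # 1-based rank
    lo = lower[lo_rank - 1] if lo_rank >= 1 else -math.inf
    hi = upper[hi_rank - 1] if hi_rank <= m else math.inf
    return lo, hi
```

Calling `jab_interval(xs, ys, x_new)` returns a `(lower, upper)` pair for the response at `x_new`; the only model-fitting cost is the `n_boot` bootstrap fits that an ensemble would need anyway.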
Interpretable Data-Driven Ship Dynamics Model: Enhancing Physics-Based Motion Prediction with Parameter Optimization
Papandreou, Christos, Mathioudakis, Michail, Stouraitis, Theodoros, Iatropoulos, Petros, Nikitakis, Antonios, Paschalakis, Stavros, Kyriakopoulos, Konstantinos
The deployment of autonomous navigation systems on ships necessitates accurate motion prediction models tailored to individual vessels. Traditional physics-based models, while grounded in hydrodynamic principles, often fail to account for ship-specific behaviors under real-world conditions. Conversely, purely data-driven models offer specificity but lack interpretability and robustness in edge cases. This study proposes a data-driven physics-based model that integrates physics-based equations with data-driven parameter optimization, leveraging the strengths of both approaches to ensure interpretability and adaptability. The model incorporates physics-based components such as 3-DoF dynamics, rudder, and propeller forces, while parameters such as resistance curve and rudder coefficients are optimized using synthetic data. By embedding domain knowledge into the parameter optimization process, the fitted model maintains physical consistency. The approach is validated on two container ships by comparing predictions against ground-truth trajectories, both qualitatively and quantitatively. The results demonstrate significant improvements in predictive accuracy and reliability of the data-driven physics-based models over baseline physics-based models tuned with traditional marine engineering practices. The fitted models capture ship-specific behaviors in diverse conditions, with predictions that are 51.6% (ship A) and 57.8% (ship B) more accurate and 72.36% (ship A) and 89.67% (ship B) more consistent.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Greece (0.04)
- Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Transportation > Marine (0.89)
- Transportation > Freight & Logistics Services > Shipping > Container Ship (0.35)
Automated Assessment of Residual Plots with Computer Vision Models
Li, Weihao, Cook, Dianne, Tanaka, Emi, VanderPlas, Susan, Ackermann, Klaus
Plotting the residuals is a recommended procedure to diagnose deviations from linear model assumptions, such as non-linearity, heteroscedasticity, and non-normality. The presence of structure in residual plots can be tested using the lineup protocol to do visual inference. There are a variety of conventional residual tests, but the lineup protocol, used as a statistical test, performs better for diagnostic purposes because it is less sensitive and applies more broadly to different types of departures. However, the lineup protocol relies on human judgment, which limits its scalability. This work presents a solution by providing a computer vision model to automate the assessment of residual plots. It is trained to predict a distance measure that quantifies the disparity between the residual distribution of a fitted classical normal linear regression model and the reference distribution, based on Kullback-Leibler divergence. From extensive simulation studies, the computer vision model exhibits lower sensitivity than conventional tests but higher sensitivity than human visual tests. It is slightly less effective on non-linearity patterns. Several examples from classical papers and contemporary data illustrate the new procedures, highlighting their usefulness in automating the diagnostic process and supplementing existing methods.
- North America > United States > Nebraska > Lancaster County > Lincoln (0.14)
- Oceania > Australia (0.04)
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
Predictive inference is free with the jackknife+-after-bootstrap
Ensemble learning is widely used in applications to make predictions in complex decision problems---for example, averaging models fitted to a sequence of samples bootstrapped from the available training data. While such methods offer more accurate, stable, and robust predictions and model estimates, much less is known about how to perform valid, assumption-lean inference on the output of these types of procedures. In this paper, we propose the jackknife+-after-bootstrap (J+aB), a procedure for constructing a predictive interval, which uses only the available bootstrapped samples and their corresponding fitted models, and is therefore "free" in terms of the cost of model fitting. The J+aB offers a predictive coverage guarantee that holds with no assumptions on the distribution of the data, the nature of the fitted model, or the way in which the ensemble of models are aggregated---at worst, the failure rate of the predictive interval is inflated by a factor of 2. Our numerical experiments verify the coverage and accuracy of the resulting predictive intervals on real data.
AR-Sieve Bootstrap for the Random Forest and a simulation-based comparison with rangerts time series prediction
Fokam, Cabrel Teguemne, Jentsch, Carsten, Lang, Michel, Pauly, Markus
The Random Forest (RF) algorithm can be applied to a broad spectrum of problems, including time series prediction. However, neither the classical IID (independent and identically distributed) bootstrap nor block bootstrapping strategies (as implemented in rangerts) completely account for the nature of the Data Generating Process (DGP) while resampling the observations. We propose the combination of RF with a residual bootstrapping technique where we replace the IID bootstrap with the AR-Sieve Bootstrap (ARSB), which assumes the DGP to be an autoregressive process. To assess the new model's predictive performance, we conduct a simulation study using synthetic data generated from different types of DGPs. It turns out that ARSB provides more variation amongst the trees in the forest. Moreover, RF with ARSB shows greater accuracy compared to RF with other bootstrap strategies. However, these improvements come at some cost in efficiency.
- Europe > Austria > Vienna (0.14)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.07)
- Europe > Germany (0.05)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.56)
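The residual-resampling step behind the AR-Sieve Bootstrap entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it fixes the autoregressive order at 1 (a sieve bootstrap would choose the order from the data), estimates the coefficient by least squares, and uses the hypothetical function name `ar1_sieve_bootstrap`. It also assumes the series is not constant.

```python
import random

def ar1_sieve_bootstrap(series, rng):
    """Return one bootstrap replicate of `series` under a fitted AR(1) model."""
    n = len(series)
    mean = sum(series) / n
    z = [v - mean for v in series]  # work with the centered series

    # Least-squares estimate of the AR(1) coefficient.
    phi = sum(z[t] * z[t - 1] for t in range(1, n)) / sum(v * v for v in z[:-1])

    # Residuals of the fitted model, centered so they average to zero.
    resid = [z[t] - phi * z[t - 1] for t in range(1, n)]
    rbar = sum(resid) / len(resid)
    resid = [r - rbar for r in resid]

    # Rebuild a series recursively from residuals drawn with replacement.
    boot = [z[0]]
    for _ in range(1, n):
        boot.append(phi * boot[-1] + rng.choice(resid))
    return [v + mean for v in boot]
```

In a forest, each tree would be trained on its own replicate, for example `ar1_sieve_bootstrap(series, random.Random(tree_id))`, in place of an IID or block resample of the observations.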
Inference at the data's edge: Gaussian processes for modeling and inference under model-dependency, poor overlap, and extrapolation
Cho, Soonhong, Kim, Doeun, Hazlett, Chad
The Gaussian Process (GP) is a highly flexible non-linear regression method that provides a principled approach to handling our uncertainty over predicted (counterfactual) values. It does so by computing a posterior distribution over each predicted point as a function of a chosen model space and the observed data, in contrast to conventional approaches that effectively compute uncertainty estimates conditionally on placing full faith in a fitted model. This is especially valuable under conditions of extrapolation or weak overlap, where model dependency poses a severe threat. We first offer an accessible explanation of GPs, and provide an implementation suitable to social science inference problems. In doing so we reduce the number of user-chosen hyperparameters from three to zero. We then illustrate the settings in which GPs can be most valuable: those where conventional approaches have poor properties due to model-dependency/extrapolation in data-sparse regions. Specifically, we apply it to (i) comparisons in which treated and control groups have poor covariate overlap; (ii) interrupted time-series designs, where models are fitted prior to an event but extrapolated after it; and (iii) regression discontinuity, which depends on model estimates taken at or just beyond the edge of their supporting data.
- North America > United States > Vermont (0.04)
- North America > United States > District of Columbia (0.04)
- Health & Medicine (0.68)
- Government > Voting & Elections (0.67)
Statistical Model Criticism using Kernel Two Sample Tests
We propose an exploratory approach to statistical model criticism using maximum mean discrepancy (MMD) two sample tests. Typical approaches to model criticism require a practitioner to select a statistic by which to measure discrepancies between data and a statistical model. MMD two sample tests are instead constructed as an analytic maximisation over a large space of possible statistics and therefore automatically select the statistic which most shows any discrepancy. We demonstrate on synthetic data that the selected statistic, called the witness function, can be used to identify where a statistical model most misrepresents the data it was trained on. We then apply the procedure to real data where the models being assessed are restricted Boltzmann machines, deep belief networks and Gaussian process regression and demonstrate the ways in which these models fail to capture the properties of the data they are trained on.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
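The witness function mentioned in the kernel two-sample entry above can be sketched in its empirical, unnormalized form: the difference between the mean kernel similarity to the observed data and the mean kernel similarity to samples drawn from the model. The Gaussian kernel, the fixed bandwidth, the one-dimensional inputs, and the function names are assumptions made for illustration; the sign convention here is positive where the data has more mass than the model.

```python
import math

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2))

def witness(x, data, model_samples, bandwidth=1.0):
    """Empirical MMD witness at x: mean k(x, data) minus mean k(x, model)."""
    k_data = sum(gaussian_kernel(x, d, bandwidth) for d in data) / len(data)
    k_model = sum(
        gaussian_kernel(x, m, bandwidth) for m in model_samples
    ) / len(model_samples)
    return k_data - k_model
```

Evaluating `witness` over a grid of `x` values and looking for the largest absolute values points to the regions where the model's samples and the data disagree most.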