 anova table


Blocked Cross-Validation: A Precise and Efficient Method for Hyperparameter Tuning

arXiv.org Artificial Intelligence

Hyperparameter tuning plays a crucial role in optimizing the performance of predictive learners. Cross-validation (CV) is a widely adopted technique for estimating the error of different hyperparameter settings, and repeated cross-validation (RCV) is commonly used to reduce the variability of CV errors. In this paper, we introduce a novel approach called blocked cross-validation (BCV), in which the repetitions are blocked with respect to both the CV partition and the random behavior of the learner. Theoretical analysis and empirical experiments demonstrate that BCV provides more precise error estimates than RCV, even with a significantly reduced number of runs. We present extensive examples using real-world data sets to showcase the effectiveness and efficiency of BCV in hyperparameter tuning. Our results indicate that BCV outperforms RCV in hyperparameter tuning, achieving greater precision with fewer computations.
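To make the blocking idea concrete, here is a minimal Python sketch of one possible reading of the abstract: within each repetition, every hyperparameter setting is evaluated on the same CV partition and with the same learner seed. This is an illustration under stated assumptions, not the paper's implementation. The names `make_folds`, `toy_learner`, `cv_error`, and `repeated_cv`, the shrunken-mean learner, and the simulated data are all hypothetical.

```python
import random
from statistics import mean

def make_folds(n, k, seed):
    """Randomly assign n observation indices to k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def toy_learner(train_y, lam, seed):
    """Hypothetical randomized learner: shrunken mean of a bootstrap sample."""
    rng = random.Random(seed)
    sample = [rng.choice(train_y) for _ in train_y]
    return mean(sample) / (1.0 + lam)

def cv_error(y, lam, partition_seed, learner_seed, k=5):
    """k-fold CV estimate of squared error for one hyperparameter value."""
    sq_errors = []
    for fold in make_folds(len(y), k, partition_seed):
        held_out = set(fold)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        pred = toy_learner(train, lam, learner_seed)
        sq_errors.extend((y[i] - pred) ** 2 for i in fold)
    return mean(sq_errors)

def repeated_cv(y, lams, reps, blocked):
    """Average CV error per hyperparameter value over `reps` repetitions.

    blocked=True:  all values in a repetition share one partition seed
                   and one learner seed (assumed reading of "blocking").
    blocked=False: each (repetition, value) pair gets its own seeds.
    """
    results = []
    for j, lam in enumerate(lams):
        errs = []
        for r in range(reps):
            if blocked:
                p_seed, l_seed = r, 1000 + r
            else:
                p_seed = r * len(lams) + j
                l_seed = 1000 + r * len(lams) + j
            errs.append(cv_error(y, lam, p_seed, l_seed))
        results.append(mean(errs))
    return results

# Simulated data; the value 0.1 appears twice on purpose.
data_rng = random.Random(0)
y = [2.0 + data_rng.gauss(0, 1) for _ in range(60)]
lams = [0.0, 0.1, 0.1]

blocked = repeated_cv(y, lams, reps=4, blocked=True)
unblocked = repeated_cv(y, lams, reps=4, blocked=False)
print("blocked:  ", blocked)
print("unblocked:", unblocked)
```

Because the two identical settings share partitions and seeds in the blocked run, their estimated errors coincide exactly, so differences between settings reflect the hyperparameter rather than partition or learner noise. In the unblocked run the same two settings generally receive different estimates.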


Behavior of Hyper-Parameters for Selected Machine Learning Algorithms: An Empirical Investigation

arXiv.org Artificial Intelligence

Hyper-parameters (HPs) are an important part of machine learning (ML) model development and can greatly influence performance. This paper studies their behavior for three algorithms applied to structured data: Extreme Gradient Boosting (XGB), Random Forest (RF), and Feedforward Neural Network (FFNN). Our empirical investigation examines the qualitative behavior of model performance as the HPs vary, quantifies the importance of each HP for the different ML algorithms, and assesses the stability of performance near the optimal region. Based on the findings, we propose a set of guidelines for efficient HP tuning by reducing the search space.


Reduced regression models and tests for linear hypotheses

#artificialintelligence

On a SAS discussion forum, a statistical programmer asked how to interpret the statistics that are displayed when you use the TEST statement in PROC REG (or other SAS regression procedures) to test linear relationships between regression coefficients. The documentation for the TEST statement in PROC REG explains the F test in terms of a matrix of linear constraints, but the programmer wanted a simpler explanation. Fortunately, the TEST statement can be explained from first principles by running two regression models.
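The two-model idea can be sketched in Python with NumPy (the article itself works in SAS). The sketch fits a full model and a reduced model that imposes the linear constraint, then forms the standard extra-sum-of-squares F statistic from the two error sums of squares. The simulated data, the particular hypothesis (equal coefficients on `x1` and `x2`), and the helper `sse` are assumptions made for illustration only.

```python
import numpy as np

def sse(X, y):
    """Sum of squared residuals from an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Simulated data (hypothetical): the true coefficients on x1 and x2 are equal.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(scale=0.5, size=n)

ones = np.ones(n)
# Full model: y = b0 + b1*x1 + b2*x2
X_full = np.column_stack([ones, x1, x2])
# Reduced model: substituting b1 = b2 gives y = b0 + b*(x1 + x2)
X_reduced = np.column_stack([ones, x1 + x2])

sse_full = sse(X_full, y)
sse_reduced = sse(X_reduced, y)

q = X_full.shape[1] - X_reduced.shape[1]   # number of linear constraints
df_error = n - X_full.shape[1]             # error degrees of freedom, full model

# F = (increase in SSE per constraint) / (mean squared error of the full model)
f_stat = ((sse_reduced - sse_full) / q) / (sse_full / df_error)
print(f"SSE full = {sse_full:.4f}, SSE reduced = {sse_reduced:.4f}, F = {f_stat:.4f}")
```

The reduced model can never fit better than the full model, so the numerator measures how much fit is lost by imposing the constraint. A large F value indicates that the data are hard to reconcile with the hypothesized relationship.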


Dissecting 1-Way ANOVA and ANCOVA with Examples in R

#artificialintelligence

ANOVA (Analysis of Variance) is a method for comparing the means of more than two groups. It also works for two groups, although that comparison is usually made with a hypothesis test such as a t-test. This article focuses on comparing the means of more than two groups. ANOVA partitions the overall variability of a continuous outcome into components. One-way ANOVA applies when the groups are defined by the values of a single factor.
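The decomposition of variability can be shown with a small worked example. The article uses R; the sketch below uses plain Python instead, and the function `one_way_anova` and the three groups of values are made up for illustration. It splits the total sum of squares into a between-group part and a within-group part and forms the F statistic from their mean squares.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F for one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of observations around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    # Overall variability around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_within": df_within,
        "f": f_stat,
    }

# Hypothetical outcome values for three levels of a single factor.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name}: {value}")
```

For these values the between-group sum of squares is 24 and the within-group sum of squares is 6, which add up to the total of 30. With 2 and 6 degrees of freedom the F statistic is 12. Obtaining a p-value requires the F distribution, which this sketch leaves out.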