An integrated perspective of robustness in regression through the lens of the bias-variance trade-off

Okuno, Akifumi

arXiv.org Machine Learning 

Robustness matters in many fields, particularly those that estimate statistical parameters from real-world observations. However, the robust estimation techniques developed in different methodologies pursue different objectives, and each has been studied within its own framework. It is therefore important to reexamine the purpose of robust estimation and to provide an integrated perspective across disciplinary boundaries. To this end, this study first classifies the goals of robust estimation methods into three categories: resistance to (1) outlier contamination (see, e.g., Huber and Ronchetti (1981) and Hampel et al. (1986)), (2) user-specified imaginary dataset perturbation (see, e.g., Ben-Tal and Nemirovski (2002) and Biggio et al. (2013)), and (3) model misspecification. Category (3) can in certain cases be addressed by using expressive models; it is discussed later but is not the main focus. This study therefore concentrates on categories (1) and (2) in the context of linear regression.

(1) Outlier-resistance. Outliers are data points that deviate significantly from the overall trend of the other observations in a dataset. Because outliers can distort statistical parameter estimates and lead to unintended results, outlier-resistant estimation has been studied for many decades, mainly in the field of statistics (Huber and Ronchetti, 1981; Hampel et al., 1986; Maronna et al., 2006). Originating from the works of Tukey (1960) and Huber (1964), many outlier-resistant estimators are designed by modifying the loss function.
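As an illustration of designing an outlier-resistant estimator by modifying the loss function, the following minimal sketch compares ordinary least squares with a Huber-type loss in linear regression. It is not taken from the paper: the synthetic data, the function names (`huber_loss`, `fit_ols`, `fit_huber`), the choice of iteratively reweighted least squares as the solver, the threshold `delta=1.345`, and the residual scale fixed at 1 are all assumptions made for the example.

```python
import numpy as np

def huber_loss(r, delta=1.345):
    """Huber-type loss: quadratic for |r| <= delta, linear beyond it."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def fit_ols(X, y):
    """Ordinary least squares (squared loss)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fit_huber(X, y, delta=1.345, n_iter=50):
    """Huber regression via iteratively reweighted least squares.

    Assumption: the residual scale is treated as 1 (no scale estimation).
    """
    beta = fit_ols(X, y)
    for _ in range(n_iter):
        r = y - X @ beta
        a = np.maximum(np.abs(r), 1e-12)   # avoid division by zero
        w = np.minimum(1.0, delta / a)     # weights: 1 inside, delta/|r| outside
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# Synthetic data (hypothetical): y = 1 + 2x + noise, with 10% gross outliers.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-3, 3, size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)
y[:20] += 30.0  # contaminate the first 20 responses

beta_ols = fit_ols(X, y)
beta_huber = fit_huber(X, y)
print("OLS estimate:  ", beta_ols)
print("Huber estimate:", beta_huber)
```

Because the Huber-type loss grows only linearly for large residuals, each contaminated point has bounded influence on the fit, so the estimate should stay much closer to the true coefficients than the least-squares one in this setup.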
