Dealing with Outliers Using Three Robust Linear Regression Models
Roughly 10% of the data was identified as outliers, and all of the observations we introduced were correctly classified as outliers. Then, we can quickly plot the inliers against the outliers to see the remaining 26 observations that were flagged. Figure 5 shows that the observations located farthest from the hypothetical best-fit line of the original data are considered outliers.

The last of the robust regression algorithms available in scikit-learn is Theil-Sen regression. It is a non-parametric regression method, which means that it makes no assumption about the underlying data distribution. In short, it involves fitting multiple regression models on subsets of the training data and then aggregating the coefficients at the last step.
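To make the "fit on subsets, then aggregate" idea concrete, here is a minimal sketch of the classic one-feature form of the Theil-Sen estimator, written with the Python standard library only. It is not scikit-learn's implementation (there the estimator is exposed as `sklearn.linear_model.TheilSenRegressor`). The function name `theil_sen_fit`, the choice of the median as the aggregation step, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def theil_sen_fit(x, y):
    """Return (slope, intercept) for a simplified one-feature Theil-Sen fit."""
    # Every pair of points with distinct x values is a minimal subset
    # that defines exactly one line; collect the slope of each such line.
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")

    # Aggregate the per-subset coefficients with the median, which a
    # minority of extreme slopes cannot move.
    slope = median(slopes)
    # One common choice for the intercept: the median of y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical toy data: the line y = 2x + 1 with two gross outliers.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [2 * xi + 1 for xi in x]
y[3] = 40.0
y[8] = -25.0

slope, intercept = theil_sen_fit(x, y)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

On this toy data, 28 of the 45 pairwise slopes come from clean points and equal 2, so the median ignores the two corrupted observations and the fit recovers the underlying line. With many features or many samples, enumerating every subset becomes expensive, which is why practical implementations usually work with a random sample of subsets instead.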
Aug-31-2022, 15:06:56 GMT