A visual explanation for regularization of linear models


Personally, my biggest initial stumbling block was this: the math used to implement regularization does not correspond to the pictures commonly used to explain it. Take a look at the oft-copied picture (shown below left) from page 71 of The Elements of Statistical Learning (ESL), in the section on "Shrinkage Methods." Students see this picture many times in their careers but have trouble mapping it to the relatively straightforward mathematics used to regularize linear model training. The simple reason is that the illustration shows how we regularize models conceptually, with hard constraints, not how we actually implement regularization, with soft constraints! Conceptually, regularization uses a hard constraint to keep coefficients from getting too large: the coefficients must stay inside a fixed region (the cyan circles in the ESL picture). In practice, implementations instead add a penalty term to the loss function that grows with the size of the coefficients.
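To make the soft-constraint side concrete, here is a minimal sketch assuming L2 (ridge) regularization on synthetic data. The function name `ridge_soft`, the data, and the penalty values are all illustrative assumptions, not taken from ESL or any particular library. The hard-constraint view would minimize the squared error subject to ||b||^2 <= t for some radius t; the soft-constraint view used here minimizes squared error plus lam * ||b||^2, with no explicit boundary anywhere in the code.

```python
import numpy as np

def ridge_soft(X, y, lam):
    """Soft-constraint (penalty) form: minimize ||y - X b||^2 + lam * ||b||^2.

    Hypothetical helper for illustration; uses the closed-form solution
    b = (X^T X + lam * I)^(-1) X^T y and fits no intercept.
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data (an assumption for this sketch): two features, known coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([3.0, -2.0]) + rng.normal(scale=0.5, size=50)

# A larger penalty pulls the coefficient vector closer to the origin,
# even though no hard boundary is ever enforced.
lams = [0.0, 1.0, 10.0, 100.0]
norms = [float(np.linalg.norm(ridge_soft(X, y, lam))) for lam in lams]
for lam, norm in zip(lams, norms):
    print(f"lambda={lam:6.1f}  ||b||={norm:.3f}")
```

The coefficient norm shrinks as `lam` grows, so each penalty strength implicitly corresponds to some circle radius in the hard-constraint picture, which is why the two views describe the same family of solutions even though the code looks nothing like the illustration.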