I understand that variable selection is a part of model selection. But what exactly does model selection consist of?

I ask because I am reading the Burnham & Anderson article on AIC vs BIC, where they discuss AIC and BIC in model selection. Reading it, I realize I have been thinking of 'model selection' as 'variable selection'.

Here is an excerpt from the article, where they talk about 12 models with increasing degrees of "generality". These models show "tapering effects" (Figure 1) when KL-Information is plotted against the 12 models:

DIFFERENT PHILOSOPHIES AND TARGET MODELS ... Despite that the target of BIC is a more general model than the target model for AIC, the model most often selected here by BIC will be less general than Model 7 unless n is very large.
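
To make my current understanding concrete, here is a minimal Python sketch of how I picture AIC and BIC choosing among a nested sequence of models. Everything numeric in it is my own assumption and not taken from the article: the sample size `n`, the baseline log-likelihood, the rule that model `k` has `k` parameters, and the made-up "tapering" gain of `7.5 / j` in log-likelihood from the `j`-th parameter. Only the two criterion formulas are the standard definitions.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical setup (illustrative only, not from the article):
# 12 nested models, where model k has k parameters and each extra
# parameter j improves the maximized log-likelihood by a shrinking
# amount 7.5 / j, imitating a "tapering effect".
n = 200
models = range(1, 13)
log_liks = {
    k: -300.0 + sum(7.5 / j for j in range(1, k + 1))
    for k in models
}

# Each criterion selects the model with the smallest score.
best_aic = min(models, key=lambda k: aic(log_liks[k], k))
best_bic = min(models, key=lambda k: bic(log_liks[k], k, n))

print("AIC selects model", best_aic)
print("BIC selects model", best_bic)
```

With these made-up numbers, BIC's heavier per-parameter penalty (ln(n) instead of 2) makes it stop at a less general model than AIC. Note that nothing here involves choosing variables as such: the candidates are just a set of models indexed by complexity.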