Variable Reduction: An art as well as Science
Variable reduction is a crucial step for accelerating model building without losing the potential predictive power of the data. With the advent of Big Data and sophisticated data mining techniques, the number of variables encountered is often tremendous making variable selection or dimension reduction techniques imperative to produce models with acceptable accuracy and generalization. The temptation to build an ecological model using all available information (i.e., all variables) is hard to resist. Ample time and money are exhausted gathering data and supporting information. Analytical limitations require us to think carefully about the variables we choose to model, rather than adopting a naive approach where we blindly use all information to understand complexity. The purpose of this post is to illustrate the use of some techniques to effectively manage the selection of explanatory variables consequently leading to a parsimonious model with highest possible prediction accuracy.
Apr-23-2017, 04:47:12 GMT
- Technology: