ledell/useR-machine-learning-tutorial
Instructions for how to install the neccessary software for this tutorial is available here. Data for the tutorial can be downloaded by running ./data/get-data.sh (requires wget). Certain algorithms don't scale well when there are millions of features. For example, decision trees require computing some sort of metric (to determine the splits) on all the feature values (or some fraction of the values as in Random Forest and Stochastic GBM). Therefore, computation time is linear in the number of features. Algorithms can deal with data sparsity (where many of the feature values are zero) in different ways.
Jul-1-2016, 11:11:03 GMT
- Genre:
- Technology: