Noob question: why should we normalize test data with mean and std from training data? • /r/MachineLearning

Jun-6-2016, 20:05:45 GMT–#artificialintelligence

Nah. It's only really needed for things like neural networks, where it keeps the features on a scale where gradient descent behaves well, and for linear/logistic regression, where it isn't strictly required either but puts the weights on a comparable scale, so you can read them as a rough measure of each feature's contribution to the prediction. Models like Random Forest are built from decision trees, which split on thresholds and will find a split wherever it lies, so how the features are scaled doesn't matter. For stuff like nearest neighbours, it can be important, or it can hurt. That's because normalisation amounts to saying all features are equally important, which isn't necessarily true. For example, if you've got spatial information in a rectangular space, normalising stretches the short axis of that rectangle to count as much as the long one, so it favours the short axis and distorts the real distances.
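To make the nearest-neighbour point concrete, here is a minimal sketch in plain Python. The data points and the helper names (`fit_scaler`, `transform`, `nearest`) are made up for illustration and don't come from any library. It computes the mean and std from the training points only, reuses those same statistics on the query (as in the question title), and checks which training point counts as "nearest" before and after scaling.

```python
from math import dist
from statistics import mean, pstdev


def fit_scaler(rows):
    """Per-feature (mean, std), computed from the training rows only."""
    return [(mean(col), pstdev(col)) for col in zip(*rows)]


def transform(rows, stats):
    """Standardise rows using previously fitted (mean, std) pairs."""
    return [tuple((x - m) / s for x, (m, s) in zip(row, stats)) for row in rows]


def nearest(query, points):
    """Index of the point closest to the query (Euclidean distance)."""
    return min(range(len(points)), key=lambda i: dist(query, points[i]))


# Hypothetical points in a long, thin rectangle: x spans 0..100, y spans 0..10.
train = [(0.0, 10.0), (100.0, 0.0), (30.0, 2.0), (10.0, 10.0)]
query = (10.0, 2.0)  # stands in for a test point

# Fit on training data only, then apply the same statistics to the query.
stats = fit_scaler(train)
train_scaled = transform(train, stats)
query_scaled = transform([query], stats)[0]

raw_idx = nearest(query, train)                    # neighbour in original units
scaled_idx = nearest(query_scaled, train_scaled)   # neighbour after scaling

print("raw nearest:   ", train[raw_idx])
print("scaled nearest:", train[scaled_idx])
```

With these particular numbers the nearest neighbour changes once the features are standardised, because the short y axis gets stretched to carry as much weight as the long x axis. Whether that's what you want depends on whether the original units are meaningful as distances.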
