Random forest - impute or remove NA values? Which is the better approach? • /r/MachineLearning
Can you reduce the dimensionality of your feature space at all (using PCA or something similar)? This would probably improve your results when removing the NAs, since dropping rows leaves you with less data to cover the same number of dimensions. Are the NA values present in every dimension? If only a couple of dimensions contain NAs, try training without those dimensions and see what happens. What does your data represent, and why are there NAs? Depending on what your data corresponds to, imputation may make more or less sense.
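To check whether the NAs are spread across every dimension or concentrated in a few, something like the rough sketch below might help. It uses only the standard library; the toy data and the helper names (`is_na`, `na_report`, `drop_columns`) are made up for illustration and are not from any particular package.

```python
import math

def is_na(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def na_report(rows):
    """Return (per-column NA fraction, fraction of rows with at least one NA)."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    col_frac = [
        sum(is_na(row[j]) for row in rows) / n_rows
        for j in range(n_cols)
    ]
    row_frac = sum(any(is_na(v) for v in row) for row in rows) / n_rows
    return col_frac, row_frac

def drop_columns(rows, col_frac, max_na_frac=0.0):
    """Keep only the columns whose NA fraction is at most max_na_frac."""
    keep = [j for j, frac in enumerate(col_frac) if frac <= max_na_frac]
    return keep, [[row[j] for j in keep] for row in rows]

# Hypothetical toy data: 4 rows, 3 features; only column 1 has missing values.
data = [
    [1.0, float("nan"), 3.0],
    [2.0, 5.0, 1.0],
    [0.5, None, 2.0],
    [4.0, 7.0, 0.0],
]

col_frac, row_frac = na_report(data)
kept, reduced = drop_columns(data, col_frac)
print("NA fraction per column:", col_frac)
print("Fraction of rows with any NA:", row_frac)
print("Columns kept:", kept)
```

In this toy case, removing rows with NAs would throw away half the data, while dropping the single affected column keeps every row. If the NAs turn out to be spread across most columns instead, dropping columns stops being an option and the choice really is between removing rows and imputing.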
May-31-2016, 22:16:47 GMT