Three-way data splits (training, test and validation) for model selection and performance estimation
The use of training, validation and test datasets is common but not easily understood. In this post, I attempt to clarify this concept. The post is part of my forthcoming book on learning Artificial Intelligence, Machine Learning and Deep Learning based on high school maths. And then comes up with an important statement: Reference to a "validation dataset" disappears if the practitioner is choosing to tune model hyperparameters using k-fold cross-validation with the training dataset. Model selection: involves selecting optimal parameters or a model.
Sep-17-2019, 17:00:54 GMT
- Technology: