Machine Learning for Unbalanced Datasets using Neural Networks
There are a few ways to address unbalanced datasets: the built-in class_weight parameter in logistic regression and other sklearn estimators, manual oversampling, and SMOTE. We will look at whether neural networks can serve as a reliable out-of-the-box solution and which parameters can be tweaked to achieve better performance. Code is available on GitHub. We'll use the Framingham Heart Study data set from Kaggle for this exercise. It presents a binary classification problem in which we need to predict the value of the variable "TenYearCHD" (zero or one), which indicates whether a patient will develop coronary heart disease within ten years.
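To make the class-weighting and manual-oversampling ideas mentioned above concrete, here is a minimal sketch in plain Python. It is not the code from the GitHub repository. The helper names (balanced_class_weights, random_oversample) are hypothetical, and the 85/15 toy labels are invented for illustration rather than taken from the Framingham data. The weight formula is the "balanced" heuristic, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)} for each class."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        # All original rows belonging to this class
        pool = [row for row, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap (empty for the majority class)
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data (assumption): 85 negatives and 15 positives, one dummy feature per row
toy_labels = [0] * 85 + [1] * 15
toy_rows = [[i] for i in range(100)]

weights = balanced_class_weights(toy_labels)
rows_bal, labels_bal = random_oversample(toy_rows, toy_labels)

print(weights)
print(Counter(labels_bal))
```

A dictionary like weights is the kind of value that can be passed as class_weight to an sklearn estimator or to a Keras model's fit call, so the minority class contributes more to the loss. Oversampling reaches a similar goal by changing the data instead of the loss, and it should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.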
Sep-23-2019, 12:15:18 GMT