Gerela, Pinky
Binary Classification for High Dimensional Data using Supervised Non-Parametric Ensemble Method
Kanvinde, Nandan, Gupta, Abhishek, Joshi, Raunak, Gerela, Pinky
High dimensional data poses many difficulties for classification with machine learning algorithms. Generalization can be improved using ensemble learning methods such as the random forest, a bagging-based supervised non-parametric algorithm. In this paper we address binary classification of high dimensional data using a random forest on a polycystic ovary syndrome dataset. We describe the implementation and provide a detailed visualization of the data for general inference. We achieve a training accuracy of 95.6% and a validation accuracy of over 91.74%.
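The abstract does not include code, but the approach it describes can be illustrated with a short scikit-learn sketch. Everything below is an assumption for illustration only: the file name pcos.csv, the target column name, the split ratio, and the hyperparameters are hypothetical and not the authors' actual setup.

    # Sketch: bagging-based random forest for binary classification on a
    # high dimensional tabular dataset (assumed CSV layout, hypothetical names).
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Hypothetical PCOS dataset: one binary target column, many feature columns.
    data = pd.read_csv("pcos.csv")            # assumed file name
    X = data.drop(columns=["PCOS"])           # assumed target column name
    y = data["PCOS"]

    # Hold out a validation split so training and validation accuracy can be compared.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    # Random forest: an ensemble of decision trees fit on bootstrap samples (bagging),
    # supervised and non-parametric.
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    print("training accuracy:  ", accuracy_score(y_train, model.predict(X_train)))
    print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))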
Metric Effects based on Fluctuations in values of k in Nearest Neighbor Regressor
Gupta, Abhishek, Joshi, Raunak, Kanvinde, Nandan, Gerela, Pinky, Laban, Ronald Melwin
The regression branch of machine learning focuses purely on the prediction of continuous values. Supervised learning offers many regression methods with both parametric and non-parametric learning models. In this paper we target a subtle point related to a distance-based regression model, the K-Nearest Neighbors Regressor, which is a supervised non-parametric method. The point we want to demonstrate is the effect of the model's k parameter and how fluctuations in its value affect the metrics. The metrics we use are Root Mean Squared Error and R-Squared goodness of fit, with a visual representation of their values with respect to k.
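A minimal sketch of the kind of experiment described: sweep k for a KNeighborsRegressor, record RMSE and R-Squared at each value, and plot both against k. The synthetic data and the range of k below are assumptions for illustration, not the paper's actual dataset or settings.

    # Sketch: effect of varying k on KNN regression metrics (RMSE and R-squared),
    # using synthetic data purely for illustration.
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.metrics import mean_squared_error, r2_score

    X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    k_values = list(range(1, 31))    # assumed sweep range for k
    rmse, r2 = [], []
    for k in k_values:
        model = KNeighborsRegressor(n_neighbors=k).fit(X_train, y_train)
        pred = model.predict(X_test)
        rmse.append(np.sqrt(mean_squared_error(y_test, pred)))
        r2.append(r2_score(y_test, pred))

    # Visualize how both metrics fluctuate as k changes.
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(k_values, rmse, marker="o")
    ax1.set(xlabel="k", ylabel="RMSE", title="RMSE vs k")
    ax2.plot(k_values, r2, marker="o")
    ax2.set(xlabel="k", ylabel="R-squared", title="R-squared vs k")
    plt.tight_layout()
    plt.show()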