Khan, Mohammad Mahmudur Rahman
Implementation of Fuzzy C-Means and Possibilistic C-Means Clustering Algorithms, Cluster Tendency Analysis and Cluster Validation
Siddique, Md. Abu Bakr, Arif, Rezoana Bente, Khan, Mohammad Mahmudur Rahman, Ashrafi, Zahidun
Abstract-- In this paper, several two-dimensional clustering scenarios are given. In those scenarios, soft partitioning clustering algorithms (Fuzzy C-means (FCM) and Possibilistic c-means (PCM)) are applied. Afterward, VAT is used to investigate the clustering tendency visually, and then in order of checking cluster validation, three types of indices (e.g., PC, DI, and DBI) were used. After observing the clustering algorithms, it was evident that each of them has its limitations; however, PCM is more robust to noise than FCM as in case of FCM a noise point has to be considered as a member of any of the cluster. The clustering [1-3] is a subfield of data mining technique and it is very effective to pick out useful information from dataset.
ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities
Khan, Mohammad Mahmudur Rahman, Siddique, Md. Abu Bakr, Arif, Rezoana Bente, Oishe, Mahjabin Rahman
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm which has the high-performance rate for dataset where clusters have the constant density of data points. One of the significant attributes of this algorithm is noise cancellation. However, DBSCAN demonstrates reduced performances for clusters with different densities. Therefore, in this paper, an adaptive DBSCAN is proposed which can work significantly well for identifying clusters with varying densities.
Study and Observation of the Variations of Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Convolutional Neural Network
Arif, Rezoana Bente, Siddique, Md. Abu Bakr, Khan, Mohammad Mahmudur Rahman, Oishe, Mahjabin Rahman
Nowadays, deep learning can be employed to a wide ranges of fields including medicine, engineering, etc. In deep learning, Convolutional Neural Network (CNN) is extensively used in the pattern and sequence recognition, video analysis, natural language processing, spam detection, topic categorization, regression analysis, speech recognition, image classification, object detection, segmentation, face recognition, robotics, and control. The benefits associated with its near human level accuracies in large applications lead to the growing acceptance of CNN in recent years. The primary contribution of this paper is to analyze the impact of the pattern of the hidden layers of a CNN over the overall performance of the network. To demonstrate this influence, we applied neural network with different layers on the Modified National Institute of Standards and Technology (MNIST) dataset. Also, is to observe the variations of accuracies of the network for various numbers of hidden layers and epochs and to make comparison and contrast among them. The system is trained utilizing stochastic gradient and backpropagation algorithm and tested with feedforward algorithm.
Study and Observation of the Variation of Accuracies of KNN, SVM, LMNN, ENN Algorithms on Eleven Different Datasets from UCI Machine Learning Repository
Khan, Mohammad Mahmudur Rahman, Arif, Rezoana Bente, Siddique, Md. Abu Bakr, Oishe, Mahjabin Rahman
Machine learning qualifies computers to assimilate with data, without being solely programmed [1, 2]. Machine learning can be classified as supervised and unsupervised learning. In supervised learning, computers learn an objective that portrays an input to an output hinged on training input-output pairs [3]. Most efficient and widely used supervised learning algorithms are K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Large Margin Nearest Neighbor (LMNN), and Extended Nearest Neighbor (ENN). The main contribution of this paper is to implement these elegant learning algorithms on eleven different datasets from the UCI machine learning repository to observe the variation of accuracies for each of the algorithms on all datasets. Analyzing the accuracy of the algorithms will give us a brief idea about the relationship of the machine learning algorithms and the data dimensionality. All the algorithms are developed in Matlab. Upon such accuracy observation, the comparison can be built among KNN, SVM, LMNN, and ENN regarding their performances on each dataset.