This paper proposes a neuro-rough model based on multi-layered perceptron and rough set. The neuro-rough model is then tested on modelling the risk of HIV from demographic data. The model is formulated using Bayesian framework and trained using Monte Carlo method and Metropolis criterion. When the model was tested to estimate the risk of HIV infection given the demographic data it was found to give the accuracy of 62%. The proposed model is able to combine the accuracy of the Bayesian MLP model and the transparency of Bayesian rough set model.
This paper proposes an approach to training rough set models using Bayesian framework trained using Markov Chain Monte Carlo (MCMC) method. The prior probabilities are constructed from the prior knowledge that good rough set models have fewer rules. Markov Chain Monte Carlo sampling is conducted through sampling in the rough set granule space and Metropolis algorithm is used as an acceptance criteria. The proposed method is tested to estimate the risk of HIV given demographic data. The results obtained shows that the proposed approach is able to achieve an average accuracy of 58% with the accuracy varying up to 66%. In addition the Bayesian rough set give the probabilities of the estimated HIV status as well as the linguistic rules describing how the demographic parameters drive the risk of HIV.
Data collection often results in records that have missing values or variables. This investigation compares 3 different data imputation models and identifies their merits by using accuracy measures. Autoencoder Neural Networks, Principal components and Support Vector regression are used for prediction and combined with a genetic algorithm to then impute missing variables. The use of PCA improves the overall performance of the autoencoder network while the use of support vector regression shows promising potential for future investigation. Accuracies of up to 97.4 % on imputation of some of the variables were achieved.
Medical diagnosis process vary in the degree to which they attempt to deal with different complicating aspects of diagnosis such as relative importance of symptoms, varied symptom pattern and the relation between diseases them selves. Based on decision theory, in the past many mathematical models such as crisp set, probability distribution, fuzzy set, intuitionistic fuzzy set were developed to deal with complicating aspects of diagnosis. But, many such models are failed to include important aspects of the expert decisions. Therefore, an effort has been made to process inconsistencies in data being considered by Pawlak with the introduction of rough set theory. Though rough set has major advantages over the other methods, but it generates too many rules that create many difficulties while taking decisions. Therefore, it is essential to minimize the decision rules. In this paper, we use two processes such as pre process and post process to mine suitable rules and to explore the relationship among the attributes. In pre process we use rough set theory to mine suitable rules, whereas in post process we use formal concept analysis from these suitable rules to explore better knowledge and most important factors affecting the decision making.
The paper presents the investigation and implementation of the relationship between diversity and the performance of multiple classifiers on classification accuracy. The study is critical as to build classifiers that are strong and can generalize better. The parameters of the neural network within the committee were varied to induce diversity; hence structural diversity is the focus for this study. The hidden nodes and the activation function are the parameters that were varied. The diversity measures that were adopted from ecology such as Shannon and Simpson were used to quantify diversity. Genetic algorithm is used to find the optimal ensemble by using the accuracy as the cost function. The results observed shows that there is a relationship between structural diversity and accuracy. It is observed that the classification accuracy of an ensemble increases as the diversity increases. There was an increase of 3%-6% in the classification accuracy.