Heart disease covers a range of conditions that affect the heart. It is one of the most complex diseases to predict, given the number of factors in the body that can lead to it, and identifying and predicting it poses a great challenge for doctors and researchers alike. I will take a stab at this problem using machine learning with the public heart disease dataset made available at the UCI Machine Learning Repository. The dataset contains 303 records, each with 14 attributes (a mix of numeric and categorical fields, not all continuous).
Associative classification is a recent and rewarding technique that integrates association rule mining and classification into a single prediction model and achieves high accuracy. Associative classifiers are especially suited to domains, such as medicine, where high predictive accuracy is essential. Heart disease is the single largest cause of death in developed countries and one of the main contributors to the disease burden in developing countries. Mortality data from the Registrar General of India show that heart disease is a major cause of death in India, and in Andhra Pradesh coronary heart disease causes about 30% of deaths in rural areas. Hence there is a need for a decision support system that predicts heart disease in a patient. In this paper we propose an efficient associative classification algorithm that uses a genetic approach for heart disease prediction. The main motivation for using a genetic algorithm to discover high-level prediction rules is that the discovered rules are highly comprehensible, have high predictive accuracy, and have high interestingness values. Experimental results show that most of the classifier rules help predict heart disease well, which in turn helps doctors in their diagnostic decisions.
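As a rough illustration of what a class association rule looks like, the sketch below scores one rule of the form "antecedent items imply class label" by its support and confidence, the standard measures in association rule mining. The abstract above does not say which measures the authors' genetic algorithm uses as fitness, so treating support and confidence as rule quality is an assumption here, and the records, item names, and the `rule_quality` helper are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical toy records: (set of "attribute=value" items, class label).
Record = Tuple[FrozenSet[str], str]

RECORDS: List[Record] = [
    (frozenset({"chest_pain=typical", "age=high"}), "disease"),
    (frozenset({"chest_pain=typical", "age=low"}), "disease"),
    (frozenset({"chest_pain=none", "age=high"}), "healthy"),
    (frozenset({"chest_pain=typical", "age=high"}), "healthy"),
    (frozenset({"chest_pain=none", "age=low"}), "healthy"),
]


def rule_quality(
    antecedent: FrozenSet[str], label: str, records: List[Record]
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule `antecedent -> label`."""
    # Records whose items contain every item of the antecedent.
    covered = [r for r in records if antecedent <= r[0]]
    # Covered records that also carry the predicted class label.
    correct = [r for r in covered if r[1] == label]
    support = len(correct) / len(records)
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


support, confidence = rule_quality(
    frozenset({"chest_pain=typical"}), "disease", RECORDS
)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A genetic search over rules could, under this assumption, use some combination of the two numbers as the fitness of each candidate rule.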
Heart disease is the leading cause of death, and experts estimate that approximately half of all heart attacks and strokes occur in people who have not been flagged as "at risk." Thus, there is an urgent need to improve the accuracy of heart disease diagnosis. To this end, we investigate the potential of data analysis, and in particular the design and use of deep neural networks (DNNs), for detecting heart disease based on routine clinical data. Our main contribution is the design, evaluation, and optimization of DNN architectures of increasing depth for heart disease diagnosis. This work led to the discovery of a novel five-layer DNN architecture, named Heart Evaluation for Algorithmic Risk-reduction and Optimization Five (HEARO-5), that yields the best prediction accuracy. HEARO-5's design employs regularization optimization and automatically deals with missing data and/or data outliers. To evaluate and tune the architectures, we use k-way cross-validation as well as the Matthews correlation coefficient (MCC) to measure the quality of our classifications. The study is performed on the publicly available Cleveland dataset of medical information, and we are making our developments open source to further facilitate openness and research on the use of DNNs in medicine. The HEARO-5 architecture, yielding 99% accuracy and 0.98 MCC, significantly outperforms currently published research in the area.
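The two evaluation tools named in the abstract above can be sketched in a few lines. The block below computes the Matthews correlation coefficient for binary labels from the confusion-matrix counts and splits a dataset of a given size into k folds. It is a minimal sketch, not the authors' code: the function names `mcc` and `k_fold_indices` are hypothetical, reading "k-way" as ordinary k-fold cross-validation with contiguous, unshuffled folds is an assumption, and the labels are toy values.

```python
import math
from typing import List, Sequence, Tuple


def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds of (train, test) indices."""
    # The first n % k folds get one extra element so sizes differ by at most 1.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


# Toy labels: 2 true positives, 2 true negatives, 1 false positive, 1 false negative.
score = mcc([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
print(f"MCC = {score:.3f}")

# Fold sizes for a 303-record dataset and k = 5.
print([len(test) for _, test in k_fold_indices(303, 5)])
```

MCC ranges from -1 to 1 and uses all four confusion-matrix counts, which makes it more informative than accuracy alone when the classes are imbalanced.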
Improving the precision of heart disease detection has been investigated by many researchers. This interest is driven by overwhelming health care expenditures and erroneous diagnoses. As a result, various methodologies have been proposed to analyze disease factors, aiming to decrease variation in physicians' practice and to reduce medical costs and errors. In this paper, our main motivation is to develop an effective intelligent medical decision support system based on data mining techniques. In this context, five data mining classification algorithms (Naïve Bayes, Decision Tree, Discriminant, Random Forest, and Support Vector Machine) have been applied to large datasets to assess and analyze the risk factors statistically related to heart disease and to compare the performance of the classifiers. To underscore the practical viability of our approach, the selected classifiers have been implemented in MATLAB and evaluated on two datasets. The experiments showed that all of the classification algorithms are predictive and give relatively correct answers. However, the decision tree outperforms the other classifiers with an accuracy of 99.0%, followed by the random forest. The two rely on much the same mechanism, but the random forest builds an ensemble of decision trees. Although ensemble learning has been shown to produce superior results, in our case the single decision tree outperformed its ensemble version.
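The closing observation above, that a single tree can beat its ensemble, can be illustrated with the voting step alone. The sketch below combines three hand-written prediction vectors by majority vote and compares accuracies. It is only a toy under stated assumptions: a real random forest also trains each tree on a bootstrap sample with random feature subsets, the labels and predictions here are invented for illustration and are not the authors' data, and the names `accuracy` and `majority_vote` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model predictions by taking the most common vote per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Invented toy labels and per-tree predictions (illustration only).
Y_TRUE = [1, 0, 1, 1, 0, 0, 1, 0]
TREE_A = [1, 0, 1, 1, 0, 0, 1, 1]  # one mistake (last sample)
TREE_B = [0, 1, 0, 1, 0, 0, 1, 0]  # mistakes on samples 0, 1, 2
TREE_C = [0, 1, 1, 1, 0, 1, 1, 0]  # mistakes on samples 0, 1, 5

ensemble = majority_vote([TREE_A, TREE_B, TREE_C])

print("tree A  :", accuracy(Y_TRUE, TREE_A))
print("ensemble:", accuracy(Y_TRUE, ensemble))
```

Because trees B and C make the same mistakes on two samples, they outvote the stronger tree A there, so the vote scores below its best member. Ensembles tend to help when member errors are weakly correlated, which is not guaranteed on every dataset.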
Medical diagnosis processes vary in the degree to which they attempt to deal with complicating aspects of diagnosis, such as the relative importance of symptoms, varied symptom patterns, and the relations between the diseases themselves. Based on decision theory, many mathematical models, such as crisp sets, probability distributions, fuzzy sets, and intuitionistic fuzzy sets, have been developed in the past to deal with these complicating aspects. However, many such models fail to include important aspects of expert decisions. Pawlak therefore introduced rough set theory in an effort to process inconsistencies in the data under consideration. Although rough sets have major advantages over the other methods, they generate too many rules, which creates difficulties when taking decisions. It is therefore essential to minimize the decision rules. In this paper, we use two processes, a pre-process and a post-process, to mine suitable rules and to explore the relationships among the attributes. In the pre-process we use rough set theory to mine suitable rules, whereas in the post-process we apply formal concept analysis to these rules to extract better knowledge and the most important factors affecting decision making.
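To make the rough set step in the abstract above concrete, the sketch below computes the lower and upper approximations of a decision class from a small decision table, which is how rough set theory represents inconsistent data. It is a minimal sketch of the textbook definitions, not the authors' rule-mining procedure; the table, attribute values, and the `approximations` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical decision table: patient id -> (condition attribute values, decision).
TABLE: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "p1": (("chest_pain", "high_bp"), "disease"),
    "p2": (("chest_pain", "high_bp"), "healthy"),  # same conditions as p1, other decision
    "p3": (("chest_pain", "normal_bp"), "disease"),
    "p4": (("no_pain", "normal_bp"), "healthy"),
    "p5": (("no_pain", "normal_bp"), "healthy"),
}


def approximations(
    table: Dict[str, Tuple[Tuple[str, ...], str]], decision: str
) -> Tuple[Set[str], Set[str]]:
    """Return (lower, upper) approximations of the set of objects with `decision`."""
    # Group objects that are indiscernible, i.e. share all condition values.
    classes = defaultdict(set)
    for obj, (conditions, _) in table.items():
        classes[conditions].add(obj)

    target = {obj for obj, (_, d) in table.items() if d == decision}
    lower: Set[str] = set()
    upper: Set[str] = set()
    for block in classes.values():
        if block <= target:  # every member certainly has the decision
            lower |= block
        if block & target:  # at least one member has the decision
            upper |= block
    return lower, upper


lower, upper = approximations(TABLE, "disease")
print("certainly disease:", sorted(lower))
print("possibly disease :", sorted(upper))
print("boundary region  :", sorted(upper - lower))
```

Objects in the lower approximation support certain rules, while the boundary region (here the conflicting pair p1 and p2) yields only possible rules. This is one reason the number of generated rules grows and needs pruning.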