Explainable Ensemble Learning Models for Early Detection of Heart Disease

Raed Hassan Laftah, Karim Hashim Kraidi Al-Saedi

Abstract


Cardiovascular diseases (CVD) are a major global health concern, and timely, accurate diagnosis is crucial for effective treatment and management. Machine learning (ML) has advanced steadily and offers transformative potential for improving diagnostic accuracy. We aim to develop a high-precision prediction tool for heart disease using various ML models, and we apply the Local Interpretable Model-agnostic Explanations (LIME) technique to ensure the explainability of those models. We used a Kaggle dataset to implement several ML models, including Random Forest, Gradient Boosting, CatBoost, K-Nearest Neighbor, Naive Bayes, Support Vector Machine, and AdaBoost, with appropriate data preprocessing. The soft voting ensemble, which combines various models, achieved 98.54% accuracy and 99% precision, recall, and F1-score; Random Forest, CatBoost, and the Voting Classifier outperformed the other models. These results indicate that our model is highly reliable and sets a new standard for CVD prediction. Future research should focus on validating this model with larger datasets and exploring deep learning approaches.
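As a rough illustration of the soft voting idea named in the abstract, the sketch below averages the class probabilities produced by several models and picks the class with the highest mean, then contrasts that with a simple majority (hard) vote. It is not the authors' implementation: the function names `soft_vote` and `hard_vote`, the equal default weights, and the probability values are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Return (predicted class, weighted mean probability per class)."""
    if not probabilities:
        raise ValueError("at least one model output is required")
    n_classes = len(probabilities[0])
    # Assumption: equal weights unless the caller supplies others.
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return predicted, averaged


def hard_vote(probabilities: Sequence[Sequence[float]]) -> int:
    """Majority vote over each model's most probable class."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in probabilities]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical outputs [P(no disease), P(disease)] from three models
# for a single patient; these numbers are invented for the example.
model_outputs = [
    [0.45, 0.55],
    [0.45, 0.55],
    [0.90, 0.10],
]

soft_label, soft_probs = soft_vote(model_outputs)
hard_label = hard_vote(model_outputs)
print("soft vote:", soft_label, [round(p, 3) for p in soft_probs])
print("hard vote:", hard_label)
```

In this invented case two models lean weakly toward class 1 while the third is strongly confident in class 0, so the majority vote picks class 1 but the averaged probabilities favor class 0. That sensitivity to each model's confidence is the property that distinguishes soft voting from hard voting.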

Keywords


Cardiovascular Disease Prediction; Machine Learning; Ensemble Learning; Soft Voting Classifier.






DOI: https://doi.org/10.18196/jrc.v5i5.22448



Copyright (c) 2024 Raed Hassan Laftah, Karim Hashim Kraidi Al-Saedi

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

 


Journal of Robotics and Control (JRC)

P-ISSN: 2715-5056 || E-ISSN: 2715-5072
Organized by Peneliti Teknologi Teknik Indonesia
Published by Universitas Muhammadiyah Yogyakarta in collaboration with Peneliti Teknologi Teknik Indonesia, Indonesia and the Department of Electrical Engineering
Website: http://journal.umy.ac.id/index.php/jrc
Email: jrcofumy@gmail.com

