Enhanced Stacked Ensemble-Based Heart Disease Prediction with Chi-Square Feature Selection Method
DOI:
https://doi.org/10.18196/jrc.v5i6.23191
Keywords:
Heart Disease, Machine Learning, Stacking Ensemble, Feature Selection, Chi-Square.
Abstract
Heart disease (HD) is the leading cause of death globally, which calls for more accurate and affordable diagnostic technologies. Traditional HD diagnostic methods are adequate but expensive and limited, creating a need for alternatives. Machine learning (ML) is among the technologies that healthcare systems use to predict diseases. This work aims to improve the accuracy and efficiency of HD diagnosis by developing a stacked ensemble classifier that combines the predictions of three base ML classifiers, namely decision tree (DT), support vector machine (SVM), and multilayer perceptron (MLP), and uses chi-square feature selection to prioritize significant features. The stacked ensemble is intended to exploit the strengths of the base classifiers while reducing their errors, and so raise diagnostic accuracy. Applying chi-square feature selection to the Cleveland dataset, which has thirteen (13) features, the study identifies five important features for training the classifiers. Training on only the important features reduces dimensionality, simplifies the classifier, and improves computational performance. It also reduces overfitting, increases generalizability, and speeds up diagnosis, making the approach more viable for real-time clinical applications. The ensemble classifier is compared with the base classifiers, before and after feature selection, in terms of accuracy, recall, precision, and F1-score. These metrics are chosen for their ability to validate the effectiveness of the proposed diagnostic tool. Before feature selection, the stacked ensemble classifier outperformed the base classifiers with an accuracy of 85.5%. After feature selection, its accuracy improved to 90.8%.
These results indicate that the proposed method is a less expensive and more efficient diagnostic tool for HD than current methods, enabling earlier HD detection and lowering healthcare costs. In conclusion, the method could benefit healthcare systems by providing a highly accurate and affordable diagnostic tool for clinical use.
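The chi-square ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not specify: it uses the textbook Pearson chi-square statistic on a contingency table, and it assumes categorical features (continuous attributes would need to be binned first). The function names `chi_square_score` and `select_top_k`, the feature names, and the toy data are all hypothetical and are not taken from the Cleveland dataset.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one categorical feature vs. the class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # contingency-table cell counts
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Count expected in this cell if feature and label were independent.
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the largest chi-square scores."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data (NOT from the Cleveland dataset): 8 patients, binary label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "feature_a": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly aligned with the label
    "feature_b": [1, 1, 1, 0, 1, 0, 0, 0],  # partially aligned with the label
    "feature_c": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

selected = select_top_k(columns, labels, k=2)
print(selected)  # ['feature_a', 'feature_b']
```

In this toy example the scores are 8.0, 2.0, and 0.0, so the feature that is independent of the label is discarded. In the setting the abstract describes, the same kind of ranking would keep five of the thirteen features before the base classifiers and the stacked ensemble are trained.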
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
This journal is based on the work at https://journal.umy.ac.id/index.php/jrc and is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. You are free to:
- Share – copy and redistribute the material in any medium or format.
- Adapt – remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms, which include the following:
- Attribution. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- ShareAlike. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
- No additional restrictions. You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
JRC is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA).