Naive Bayes for Diabetes Prediction: Developing a Classification Model for Risk Identification in Specific Populations

Ahmad Zaki Arrayyan, Hendra Setiawan, Karisma Trinanda Putra

Abstract


Statistics show that the rising prevalence of diabetes worldwide is a major challenge for individuals, families, and nations. According to International Diabetes Federation (IDF) projections, the number of adults with diabetes is expected to rise by 46% by 2045, reaching 783 million, or one in eight adults. In response to this growing concern, this research explores the use of the Naive Bayes algorithm for predicting diabetes, applying comprehensive data cleansing and randomization techniques. The model's performance is evaluated systematically using three training and testing split ratios (65:35, 75:25, and 85:15). The results show that the model performed best at the 65:35 split ratio, reaching a maximum accuracy of 88.16%, with a precision of 0.883, a recall of 0.881, and an F1-score of 0.882.
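
The evaluation procedure described in the abstract (shuffle the data, train a Naive Bayes classifier, and compare several train/test split ratios) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, the two feature names (glucose, bmi) are hypothetical, the Gaussian variant of Naive Bayes is assumed, and precision, recall, and F1 are computed for the positive class only, which may differ from the averaging used in the paper. The numbers it prints will therefore not match the reported results.

```python
import math
import random


def make_synthetic(n, seed=0):
    """Hypothetical stand-in data: two numeric features and a binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted upward on both features (illustrative values only).
        glucose = rng.gauss(140 if label else 100, 20)
        bmi = rng.gauss(33 if label else 26, 5)
        rows.append(([glucose, bmi], label))
    return rows


def fit_gaussian_nb(train):
    """Estimate the prior, feature means, and feature variances per class."""
    model = {}
    for cls in {y for _, y in train}:
        xs = [x for x, y in train if y == cls]
        n_feat = len(xs[0])
        means = [sum(x[j] for x in xs) / len(xs) for j in range(n_feat)]
        # A small constant keeps the variance strictly positive.
        varis = [sum((x[j] - means[j]) ** 2 for x in xs) / len(xs) + 1e-9
                 for j in range(n_feat)]
        model[cls] = (len(xs) / len(train), means, varis)
    return model


def predict(model, x):
    """Return the class with the highest log posterior (up to a constant)."""
    best_cls, best_score = None, -math.inf
    for cls, (prior, means, varis) in model.items():
        score = math.log(prior)
        for xj, m, v in zip(x, means, varis):
            score += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls


def evaluate(model, test, positive=1):
    """Accuracy plus precision, recall, and F1 for the positive class."""
    tp = fp = fn = tn = 0
    for x, y in test:
        pred = predict(model, x)
        if pred == positive and y == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif y == positive:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(test), "precision": precision,
            "recall": recall, "f1": f1}


def run_splits(data, train_fractions=(0.65, 0.75, 0.85), seed=42):
    """Shuffle once, then train and test at each split ratio."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    results = {}
    for frac in train_fractions:
        cut = int(len(rows) * frac)
        model = fit_gaussian_nb(rows[:cut])
        results[frac] = evaluate(model, rows[cut:])
    return results


results = run_splits(make_synthetic(1000))
for frac, metrics in results.items():
    print(f"train fraction {frac:.2f}:",
          {name: round(value, 3) for name, value in metrics.items()})
```

Shuffling once with a fixed seed before cutting keeps the three ratios comparable, since each larger training set contains the smaller ones. A larger training fraction also leaves a smaller test set, so differences between ratios on a small dataset should be read with some caution.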

Keywords


Diabetes; Naïve Bayes; performance matrix





DOI: https://doi.org/10.18196/st.v27i1.21008



Copyright (c) 2024 Ahmad Zaki Arrayyan, Hendra Setiawan, Karisma Trinanda Putra


Semesta Teknika is licensed under a Creative Commons Attribution 4.0 International License.