Naive Bayes for Diabetes Prediction: Developing a Classification Model for Risk Identification in Specific Populations
DOI:
https://doi.org/10.18196/st.v27i1.21008
Keywords:
Diabetes, naïve bayes, performance matrix
Abstract
Diabetes prevalence is rising worldwide and poses a major challenge for individuals, families, and nations. According to International Diabetes Federation (IDF) projections, the number of adults with diabetes is expected to rise by 46% by 2045, reaching 783 million, or one in eight adults. In response to this growing concern, this research applies the Naive Bayes algorithm to diabetes prediction, using comprehensive data cleansing and randomization techniques. The model's performance is evaluated systematically at several training-to-testing split ratios (65:35, 75:25, and 85:15). The model performed best at the 65:35 split, where accuracy peaked at 88.16%, with precision 0.883, recall 0.881, and F1-score 0.882.
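The evaluation protocol described in the abstract (shuffle the data, split it at several ratios, fit Naive Bayes, report accuracy, precision, recall, and F1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian variant of Naive Bayes, the support-weighted averaging of precision, recall, and F1, the two features (glucose, BMI), and the synthetic data generator are all assumptions made for the example, since the abstract does not specify them.

```python
import math
import random


def make_synthetic_data(n=400, seed=0):
    """Hypothetical stand-in for a cleaned diabetes table: ([glucose, bmi], label)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        glucose = rng.gauss(140 if label else 100, 20)
        bmi = rng.gauss(33 if label else 27, 5)
        rows.append(([glucose, bmi], label))
    return rows


def shuffle_split(rows, train_fraction, seed=0):
    """Randomize row order, then cut into training and testing parts."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(round(len(rows) * train_fraction))
    return rows[:cut], rows[cut:]


def fit_gaussian_nb(train):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    model = {}
    for label in sorted({y for _, y in train}):
        xs = [x for x, y in train if y == label]
        n_features = len(xs[0])
        means = [sum(x[j] for x in xs) / len(xs) for j in range(n_features)]
        variances = [
            sum((x[j] - means[j]) ** 2 for x in xs) / len(xs) + 1e-9
            for j in range(n_features)
        ]
        model[label] = (math.log(len(xs) / len(train)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior (features assumed independent)."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def evaluate(model, test):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed averaging)."""
    y_true = [y for _, y in test]
    y_pred = [predict(model, x) for x, _ in test]
    pairs = list(zip(y_true, y_pred))
    scores = {"accuracy": sum(t == p for t, p in pairs) / len(test),
              "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for c in sorted(model):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = (tp + fn) / len(test)  # share of test rows in class c
        scores["precision"] += weight * p_c
        scores["recall"] += weight * r_c
        scores["f1"] += weight * f_c
    return scores


data = make_synthetic_data()
results = {}
for train_fraction in (0.65, 0.75, 0.85):  # the three ratios named in the abstract
    train, test = shuffle_split(data, train_fraction)
    results[train_fraction] = evaluate(fit_gaussian_nb(train), test)
    print(train_fraction, {k: round(v, 3) for k, v in results[train_fraction].items()})
```

The scores printed here come from synthetic data and say nothing about the figures reported in the abstract. One property does carry over: under support-weighted averaging, recall equals accuracy by construction, which would be consistent with the reported recall (0.881) and accuracy (88.16%) being nearly identical.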
License
Semesta Teknika is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).