Anamika, Isha Tyagi


Diabetes is among the most dangerous chronic diseases, and many complications arise if it is left unattended and untreated. Conventionally, identifying the condition requires the patient to visit a doctor at a medical center for consultation. Machine learning methods offer a way to address this problem. The aim of this study is to examine models that can predict the likelihood of diabetes in patients with maximal accuracy. To that end, four algorithms, namely Decision Tree, Random Forest, Naive Bayes, and Adaboost Classifier, are tested for their suitability in predicting diabetes at an early stage.

Experiments are conducted on two datasets: the Pima Indians Diabetes Database (PIDD), obtained from the UCI Machine Learning Repository, and an auxiliary dataset. The performance of all four algorithms is assessed on the basis of Accuracy, Precision, Recall, and F-Measure. The results show that the Adaboost Classifier outperforms the other algorithms, with the highest accuracy of 85.5% on PIDD and 95.4% on the auxiliary dataset. The results are also examined using Receiver Operating Characteristic (ROC) curves in a sequential manner. The efficie
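The four evaluation measures named above can be illustrated with a minimal sketch that computes them from true and predicted labels for a binary (diabetic / non-diabetic) task. The function name `evaluate`, the `Metrics` container, and the sample labels are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def evaluate(y_true, y_pred, positive=1):
    """Compute Accuracy, Precision, Recall and F-Measure for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` as the diabetic class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-Measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f_measure)


# Hypothetical labels (1 = diabetic, 0 = non-diabetic), for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
m = evaluate(y_true, y_pred)
print(m)
```

Applying the same function to each classifier's predictions on a held-out test set would give a like-for-like comparison across the four algorithms.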


Diabetes; Random Forest; Naive Bayes; Decision Tree; Adaboost; Accuracy





Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.