• Sayali R. Nipanikar, Department of Computer Engineering, PCCOE, Pune
• Dr. K. Rajeswari, Department of Computer Engineering, PCCOE, Pune


machine learning, random forest, decision tree algorithms


The variety and volume of data now available in healthcare have led many researchers to adopt machine learning approaches to keep pace with the influx. Data on COVID-19 are an area of recent interest. The widespread impact of the virus across the United States creates a clear need to identify groups of individuals who are at increased risk of mortality from it. We propose a clustered random forest approach to predict COVID-19 patient mortality. We use this approach to examine hidden heterogeneity in patient frailty, drawing on demographic information for COVID-19 patients. We find that the clustered random forest approach attains predictive performance comparable to other published methods. We also find that follow-up analysis with decision tree algorithms and linear regression provides insight into the type and magnitude of mortality risks associated with COVID-19.


How to Cite

Sayali R. Nipanikar, & Dr. K. Rajeswari. (2022). COVID-19 PREDICTION USING MACHINE LEARNING TECHNIQUES. International Education and Research Journal (IERJ), 8(7). Retrieved from