Improving Life-threatening Lung Diseases Classification using Hybrid SMOTE-ENN with assorted Machine Learning Classifiers

Document Type : Original Article

Authors

1 Department of Electronics and Communications, Mansoura University, Daqahliyah, Egypt

2 Electronics and communications department, Faculty of Engineering, Mansoura University, Egypt

3 Department of Electronics and Communications Engineering at the Faculty of Engineering, Mansoura University

Abstract

Chest radiography is one of the most common tools for diagnosing and managing bronchopneumonia and other lung diseases. In this paper, we propose a classification strategy for identifying infection in chest X-ray images. We collected 7545 chest X-ray images from an openly accessible X-ray database and separated them into three classes: healthy individuals, pneumonia patients, and COVID-19 patients. The contrast limited adaptive histogram equalization (CLAHE) method was used to improve the quality of the X-ray images, and the histogram of oriented gradients (HOG) was used to extract features from them. Classification of medical datasets is often hindered by class imbalance, where some classes contain far fewer samples than others. To address this problem, we introduce a hybrid sampling technique called SMOTE-ENN, which combines the synthetic minority oversampling technique (SMOTE) with edited nearest neighbors (ENN). A support vector machine (SVM), k-nearest neighbors (k-NN), and a random forest classifier (RFC) were used to classify the images, with classification rates of 99.47%, 98.70%, and 98.47%, respectively, on a test dataset of 1504 images. These findings may help to detect COVID-19 and pneumonia more effectively.
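
To make the hybrid sampling idea concrete, the following is a minimal pure-Python sketch of SMOTE followed by ENN. It is not the authors' implementation: the function names (smote, enn, smote_enn), the neighbor count k=3, the majority-vote rule in the ENN step, and the random seed are all assumptions chosen for illustration. In practice a library implementation such as SMOTEENN from the imbalanced-learn package would likely be used on the extracted feature vectors instead.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(idx, points, k):
    """Indices of up to k nearest neighbours of points[idx], excluding itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: sq_dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, k=3, rng=None):
    """Oversample each smaller class up to the size of the largest class.

    Each synthetic sample is a random point on the segment between a
    class member and one of its k nearest same-class neighbours.
    """
    rng = rng or random.Random(0)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(x) for x in X], list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # interpolation needs at least two samples
        for _ in range(target - n):
            i = rng.randrange(len(members))
            j = rng.choice(nearest(i, members, k))
            gap = rng.random()
            X_out.append([a + gap * (b - a)
                          for a, b in zip(members[i], members[j])])
            y_out.append(label)
    return X_out, y_out


def enn(X, y, k=3):
    """Keep only samples whose label matches the majority of their k neighbours."""
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in nearest(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, k=3, seed=0):
    """Hybrid resampling: SMOTE oversampling, then ENN cleaning."""
    X_s, y_s = smote(X, y, k, random.Random(seed))
    return enn(X_s, y_s, k)
```

In a pipeline like the one described in the abstract, a step of this kind would be applied only to the training split, after feature extraction and before fitting the classifiers, so that the test images remain untouched by resampling.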

Keywords