A Deep Learning Model for Coronavirus 2019 Pneumonia Screening

This paper proposes an automatic COVID-19 detection model based on chest X-ray images. The model is designed to give accurate diagnostics for multiclass classification (COVID vs. PNEUMONIA vs. NORMAL) and achieves a classification accuracy of 96%. Evaluation was performed on publicly available databases containing COVID-19, pneumonia, and normal X-ray images. The proposed approach uses the VGG-16 model with pre-trained weights in the initialization step.


Introduction
Coronavirus disease 2019 (COVID-19) appeared in Wuhan, China in late 2019 and spread all over the world during 2020; the World Health Organization (WHO) declared COVID-19 a pandemic.
In this context, we decided to create an automated classification model based on computer vision techniques. Fast, precise, and simple deep learning models may help overcome the COVID-19 pandemic and assist patients in receiving the right care at the right time. Although radiologists play an important role because of their extensive experience in this area, artificial intelligence (AI) tools [1] in radiology can help reach a correct diagnosis. Moreover, AI approaches can help mitigate drawbacks such as the limited availability of RT-PCR [2] test kits, test costs, and long wait times for results. Also, X-ray images cannot be differentiated without labels, but a convolutional neural network can perform this task.
Lately, numerous AI approaches have been used for COVID-19 detection. Perhaps the most interesting approach to this problem was proposed by Amakdouf [3], who used a deep learning model called Ratio-Memory Cellular Neural Network (RMCNN) to classify these three classes and obtained an accuracy of 0.67 (67%).
This paper uses a deep Convolutional Neural Network (CNN) to detect and classify COVID-19. The suggested approach takes chest X-ray images as input and relies on the pre-trained weights of the VGG-16 network for initialization. The primary purpose of this approach is to maximize classification accuracy. The paper is organized as follows. The data used, the proposed model for detecting COVID-19, and our implementation are discussed in Section 2. Section 3 describes the experimental results and the analysis used to verify the quality of the proposed approach. Section 4 presents an evaluation of the model and a comparison of performance measures for the proposed model and the RMCNN model.

Experimental Framework
The proposed method, as shown in Figure 1, consists of four phases: (i) data augmentation, (ii) image resizing, (iii) data normalization, and (iv) modeling and training of the neural network to learn the differences between the output classes.
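Phase (iii), normalization, is typically a simple rescaling of pixel intensities. As a minimal illustrative sketch (the function name is ours, not from the paper), 8-bit pixel values can be mapped to the [0, 1] range like this:

```python
import numpy as np

def normalize(image: np.ndarray) -> np.ndarray:
    """Rescale 8-bit pixel intensities to the [0, 1] range as float32."""
    return image.astype(np.float32) / 255.0

# A dummy 8-bit grayscale stand-in for a chest X-ray.
raw = np.random.randint(0, 256, size=(224, 224), dtype=np.uint8)
normed = normalize(raw)
print(normed.min() >= 0.0 and normed.max() <= 1.0)  # True
```

Normalizing to a fixed range keeps the gradients well-scaled during training, which usually speeds up convergence.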

Dataset
This study's dataset was made available on Kaggle [4] and GitHub [5]. It contains 3001 chest X-ray scans: 480 images labeled as Coronavirus, 1180 labeled as Pneumonia, and 1341 labeled as Normal. The dataset is therefore divided into three categories: NORMAL, CORONAVIRUS, and PNEUMONIA. As these counts show, the three classes have very different numbers of instances.
Imbalanced classes are a serious problem for classification algorithms, since the majority class dominates the cost function. Balancing the training set is a simple way to handle this problem. To obtain a balanced dataset [7], we apply data augmentation [6] only to the "COVID" class.
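The paper does not list the exact augmentation operations used. As a hedged sketch, minority-class oversampling with random horizontal flips and small translations (common, label-preserving choices for X-ray images) could look like the following; the function and its parameters are illustrative assumptions:

```python
import numpy as np

def augment_minority(images: np.ndarray, target_count: int, seed: int = 0) -> np.ndarray:
    """Oversample a minority class with random horizontal flips and small shifts."""
    rng = np.random.default_rng(seed)
    out = list(images)
    while len(out) < target_count:
        img = images[rng.integers(len(images))]
        if rng.random() < 0.5:
            img = img[:, ::-1]                 # horizontal flip
        shift = int(rng.integers(-10, 11))
        img = np.roll(img, shift, axis=1)      # small horizontal translation
        out.append(img)
    return np.stack(out)

covid = np.zeros((480, 224, 224), dtype=np.float32)    # stand-in for the 480 COVID scans
balanced = augment_minority(covid, target_count=1341)  # match the NORMAL class size
print(balanced.shape)  # (1341, 224, 224)
```

Augmenting only the minority class raises its instance count toward the majority classes without discarding any real data.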

Image Preprocessing
A convolutional neural network (CNN) always takes images of the same dimensions. The largest image in the sample measures 2388 × 2063 pixels, while the smallest measures 1320 × 780 pixels. We therefore resized all images to fit the CNN input dimensions. Figure 3 presents an original image and its resized version.
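The paper does not state which interpolation method was used for resizing. As a didactic sketch, a nearest-neighbour resize can be written by index mapping alone (real pipelines would use a library resize with proper interpolation):

```python
import numpy as np

def resize_nearest(image: np.ndarray, height: int = 224, width: int = 224) -> np.ndarray:
    """Nearest-neighbour resize of a 2-D image by index mapping (illustrative only)."""
    rows = (np.arange(height) * image.shape[0] / height).astype(int)
    cols = (np.arange(width) * image.shape[1] / width).astype(int)
    return image[np.ix_(rows, cols)]

large = np.zeros((2388, 2063), dtype=np.uint8)  # largest scan in the dataset
small = np.zeros((1320, 780), dtype=np.uint8)   # smallest scan in the dataset
print(resize_nearest(large).shape, resize_nearest(small).shape)  # (224, 224) (224, 224)
```

Whatever the interpolation, the key point is that every image ends up at the fixed 224 × 224 input size the network expects.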

CNN Model
In this section, we will see some of the important details of the CNN architecture used in this research.
The network takes a red, green, and blue input image of size 224×224×3. The first two layers are 3×3 convolution layers with 64 filters, followed by a max-pooling layer that scales the 224×224×64 feature maps down to 112×112×64. Then come two 3×3 convolution layers with 128 filters, followed by a max-pooling layer that scales the 112×112×128 feature maps down to 56×56×128. The next three 3×3 convolution layers have 256 filters and are followed by a max-pooling layer that scales the 56×56×256 feature maps down to 28×28×256. After that, there are two sets of three 3×3 convolution layers with 512 filters each, each set followed by a max-pooling layer. Then come two fully-connected layers with 4096 nodes each. Each node of the first fully-connected layer is connected to all 25088 nodes (7×7×512) coming from the previous layer, and the second fully-connected layer is connected to the output layer with 3 nodes. As shown in Table 2, the loss/cost function used in this network is the categorical cross-entropy, computed by the following equation:

CCE = -Σ_i y_i log(ŷ_i)

where y_i is the one-hot true label and ŷ_i the predicted probability for class i.
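As an illustration, the categorical cross-entropy loss named above can be computed in a few lines of NumPy (a didactic sketch; deep learning frameworks provide this as a built-in loss):

```python
import numpy as np

def categorical_cross_entropy(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-12) -> float:
    """Mean cross-entropy between one-hot labels and predicted class probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

# One sample per class: COVID, PNEUMONIA, NORMAL (one-hot encoded); made-up predictions.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float32)
y_pred = np.array([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])
print(round(categorical_cross_entropy(y_true, y_pred), 4))  # 0.2798
```

The loss shrinks toward zero as the predicted probability of the true class approaches 1, which is exactly what minimizing it during training encourages.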

Performance Measures
To describe the performance of a classification model, we use a confusion matrix (CM), a table that allows visualization of the performance of a model on a set of test data for which the true labels are known. To check the network's performance, we use measures based on the values in the CM. The most widely used metric for assessing classification ability is accuracy; in the cross-validation step, the accuracy (acc) was computed at every iteration. This measure gives the percentage of samples that are correctly classified, as represented in equation (3).

Accuracy = (TP + TN) / (TP + TN + FP + FN)        (3)

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
Precision evaluates the ability of the network to return only relevant results. It is represented in equation (4).

Precision = TP / (TP + FP)        (4)

The sensitivity, or recall, evaluates the ability of the network to identify all relevant results; it can be calculated with equation (5).

Recall = TP / (TP + FN)        (5)
The F1-score is a single metric that combines recall and precision using the harmonic mean, as shown in equation (7).

F1 = 2 × (Precision × Recall) / (Precision + Recall)        (7)
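All four measures can be derived directly from the confusion matrix. A small NumPy sketch, using a hypothetical 3-class matrix rather than the paper's actual results:

```python
import numpy as np

def metrics_from_confusion(cm: np.ndarray):
    """Accuracy plus macro-averaged precision, recall, and F1 from a confusion matrix.

    Rows are true classes, columns are predicted classes."""
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)            # per-class precision
    recall = tp / cm.sum(axis=1)               # per-class recall
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()

# Hypothetical COVID / PNEUMONIA / NORMAL matrix -- not the paper's numbers.
cm = np.array([[48, 1, 1],
               [2, 115, 3],
               [1, 4, 129]])
acc, prec, rec, f1 = metrics_from_confusion(cm)
print(round(acc, 3))  # 0.961
```

For a multiclass problem, precision, recall, and F1 are computed per class (one-vs-rest) and then averaged; the macro average above weights each class equally.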

Validation Results
As shown in the training history graph (Figure 4), although the training data is limited, the accuracy eventually reaches almost 96%, while the loss decreases significantly.

Test Results
After a set of test experiments, we chose the model that gave the best result on the validation set and then evaluated it on the test set, taking into account all performance measures.

Table 4:
The results of the performance metrics for our model (Model, Confusion Matrix, Accuracy, Precision, Recall, F1-Score).
Table 6 gives the results of the performance measures for the proposed model and the RMCNN model. The RMCNN had the worst training outcomes compared to the VGG-16 model, as its accuracy curve plateaued at a low 67% accuracy at the 144th epoch.

Discussion
In this study, we have used a total of 3001 images: 2521 images from the Kaggle repository [4] and an additional 480 COVID-19 chest X-ray images from GitHub [5].

Conclusion
In this paper, we propose a deep learning-based model to automatically identify chest radiographs of people with coronavirus (labeled COVID), people with pneumonia (labeled PNEUMONIA), and people without any lung disease (labeled NORMAL). Our idea is to use transfer learning: the pre-trained weights of the VGG-16 convolutional neural network on ImageNet serve as initial weights for the new model. The developed model is able to classify these three classes with an accuracy of 96.08%. This study also compares VGG-16 and RMCNN across different performance metrics; the results showed that, on the whole, the VGG-16 model is more accurate than the RMCNN model. The proposed approach can readily be used in health centers across the world during the pandemic, as the model is able to identify chest-related diseases such as COVID-19 and pneumonia within seconds.

Fig. 1 .
Fig. 1. The general diagram of the proposed method.

Fig. 3 :
Fig. 3: Image (a) presents the original image, and image (b) presents the resized image.

Fig. 4 :
Fig. 4: Plot of model accuracy/loss on the train and validation datasets.

Evaluation of the Model
To compare the performance of the proposed model to the RMCNN model [3], we first performed some experiments using the training set and then tested the models on the validation and test sets. Table 5 summarizes the performance results of our proposed model (a) and the RMCNN model (b).

Table 1:
VGG16 architecture.

After several initial experiments, we obtained the model with the best performance. Table 2 displays the hyper-parameters of the final model used in this research.

Table 2:
Hyper-parameters of the model

Table 3:
The Confusion Matrix obtained in this research.

Table 5:
Performance of both models on the test data set.

Table 6:
Comparison of performance measures for both the proposed model and the RMCNN model.