An Analysis of Image Enhancement Effects on Convolutional Neural Network-based Pulmonary Tuberculosis Detection

Pulmonary Tuberculosis (TB) is a major global infectious disease. Diagnosing TB patients involves a medical examination and chest X-ray (CXR) imaging. CXR images create an opportunity to use machine learning to help physicians and radiologists diagnose suspected TB cases. Because image quality is inconsistent, image enhancement is applied as a preprocessing step to compensate for poor-quality images. This study examines the effects of several image enhancement techniques, i.e., Histogram Equalization (HE), Contrast Limited Adaptive Histogram Equalization (CLAHE), and Fast Fourier Transform (FFT). The enhanced images are used as input for a Convolutional Neural Network (CNN): the InceptionV3 architecture, applied through transfer learning with ImageNet pre-trained weights. The image dataset consists of 3,500 normal and 3,500 tuberculosis CXR images. The best performance, in terms of accuracy and processing time, is achieved by the CLAHE enhancement technique, which increases accuracy by 4.57% compared to using the original images as input and is 5.6 ms faster per testing image. A deeper analysis shows that, although FFT achieves high accuracy, its processing time increases by 14.4 ms compared to the original image processing time. This study concludes that the choice of image enhancement needs to consider the characteristics of the images.


Introduction
Pulmonary Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis, affecting human lung tissue. Infection occurs by inhaling infectious particles from close contact with infected individuals [1]. According to the World Health Organization (2022), pulmonary tuberculosis remains a major global infectious disease, ranking second only to COVID-19 during the COVID-19 pandemic. In 2021, an estimated 10.6 million people worldwide were infected, with a 3.6% increase in cases per 100,000 population compared to 2020. The number of tuberculosis-related deaths also rose from 2019 to 2021, with an estimated 1.6 million deaths recorded by the end of 2021 [2].
Diagnosing suspected TB patients typically involves a medical history examination and a chest X-ray (CXR) image. The results of the CXR image assist physicians in diagnosing various respiratory diseases, including pulmonary tuberculosis. When analyzing chest X-ray images, doctors focus on segmenting the image regions that show indications of infection. Infected lungs typically exhibit lesions and grayish-white shadows [3]. However, errors often occur in chest X-ray screening, which has led to the use of machine learning techniques for image processing, segmentation, analysis, and accurate tuberculosis detection [4]. Machine learning with image analysis has been widely employed to assist physicians and radiologists in diagnostic decision-making. Deep learning techniques, such as Convolutional Neural Networks (CNNs), have also been used because they can leverage large datasets and extract data features with high accuracy. CNNs have demonstrated good performance and can be employed to aid in tuberculosis diagnosis [5].
However, the quality of CXR image data is inconsistent and can be influenced by various factors [6]. In neural networks, achieving optimal results requires good image data. This has led to the use of image enhancement techniques to improve image quality before processing. Image enhancement improves image features, allowing computers to study dynamic feature coverage effectively. It also reduces noise and blurring, which hinder model performance [7]. This study aims to explore the impact of image enhancement on tuberculosis detection using a CNN-based model and to analyze the effects of the enhancement on the accuracy and processing time of this CNN-based tuberculosis detection.

Related Works
Related research has been conducted on TB detection using chest X-ray image data. In [8], the author studied TB detection using deep machine learning, segmentation, and visualization. The results of this research compared nine CNN architectures. In [9] and [10], the authors also conducted studies using CNN with the latest CNN architectures for TB screening with chest X-ray images. A study was also conducted on using transfer learning in tuberculosis prediction [11].
ICIMECE 2023 https://doi.org/10.1051/e3sconf/202346502054 E3S Web of Conferences 465, 02054 (2023)
Regarding previous research on image enhancement, [12] demonstrated that enhancement techniques such as Contrast Limited Adaptive Histogram Equalization (CLAHE) and Histogram Equalization (HE) significantly impact the training data in deep learning models, resulting in increased accuracy. In [13], the author studied the effect of image enhancement methods on deep learning models, aiming to evaluate the impact of two methods, namely Unsharp Masking (UM) and High-Frequency Emphasis Filtering (HEF). In [14], the author employed various spatial and frequency techniques for image enhancement in COVID-19 diagnosis models. In [15], the author also studied the use of the Gray Level Co-occurrence Matrix (GLCM), Discrete Wavelet Transform (DWT), and Local Binary Pattern (LBP) algorithms.

Dataset
In this experiment, we use the dataset in [8], which consists of 3,500 normal CXR images and 3,500 tuberculosis CXR images gathered from several CXR image datasets. The original size of the images is 512 × 512 pixels. To accommodate the input requirement of the model, these images are resized to 300 × 300 pixels. After resizing, the images are split into 75% training, 15% validation, and 10% test data, as shown in Table 1.
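As a rough illustration of the 75/15/10 split, the sketch below shuffles image indices and cuts them into three parts. It assumes the split is applied to each class separately (a stratified split), which the text does not state; the function name `split_indices` and the fixed seed are hypothetical.

```python
import random

def split_indices(n_items, fractions=(0.75, 0.15, 0.10), seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * fractions[0])
    n_val = round(n_items * fractions[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test part
    return train, val, test

# Assumption: each class (3,500 images) is split on its own.
train, val, test = split_indices(3500)
print(len(train), len(val), len(test))  # 2625 525 350
```

Under this assumption, the two classes together would give 5,250 training, 1,050 validation, and 700 test images.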

Image Enhancement
This study uses three enhancement techniques, i.e., Histogram Equalization (HE), Contrast Limited Adaptive Histogram Equalization (CLAHE), and Fast Fourier Transform (FFT). HE and CLAHE are used for contrast enhancement, while FFT is expected to perform sharpness enhancement. These enhancements are used to see whether improving the contrast or the sharpness of an image is more suitable for the CNN model. The proposed diagram is shown in Figure 1.
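To make the contrast enhancement idea concrete, the following is a minimal sketch of global histogram equalization for an 8-bit grayscale image using NumPy. It is the textbook formulation, not necessarily the implementation used in the study, and the function name `equalize_histogram` is hypothetical.

```python
import numpy as np

def equalize_histogram(image):
    """Global histogram equalization for an 8-bit grayscale image (uint8)."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # CDF value at the darkest occupied gray level
    denom = image.size - cdf_min
    if denom == 0:                 # constant image: nothing to stretch
        return image.copy()
    # Map each gray level through the normalized CDF to spread the histogram.
    lut = np.round((cdf - cdf_min) / denom * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]

# Synthetic low-contrast image (gray levels 100..139) standing in for a CXR.
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
equalized = equalize_histogram(low_contrast)
print(low_contrast.min(), low_contrast.max(), "->", equalized.min(), equalized.max())
```

After equalization the occupied gray levels are stretched over the full 0 to 255 range, which is the contrast effect HE is meant to provide.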

Convolutional Neural Network (CNN)
A Convolutional Neural Network (CNN) is a specialization of neural networks designed to process data with grid-like topologies, such as the 2D pixel grids found in image data [20]. This type of neural network is based on mathematical convolution operations. The InceptionV3 architecture developed by Google [21] is used in this study. This specific CNN is chosen for its transfer learning capability with ImageNet pre-trained weights; the proposed model is shown in Figure 3. The input images for this architecture are 300 × 300 pixels and normalized.

Results
This study uses four types of images as the CNN's input, i.e., original, HE-, CLAHE-, and FFT-enhanced images. These images are fed into the CNN architecture with transfer learning from ImageNet, the default for the InceptionV3 model. The performance on these images is measured using the model's accuracy and loss. The accuracy is measured using the following formula:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP is True Positive, TN is True Negative, FP is False Positive, and FN is False Negative.
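As a small worked example of the accuracy measure, the sketch below computes the share of correct predictions from confusion-matrix counts, using the standard definition of accuracy. The counts are hypothetical and are not results from this study.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions among all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

# Hypothetical counts for a 700-image test set (illustration only).
print(f"{accuracy(tp=330, tn=335, fp=15, fn=20):.4f}")  # 0.9500
```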
Regarding the hyperparameters, this study conducted manual hyperparameter tuning. The tuning process looks for hyperparameters that balance accuracy, loss, and processing time per image in milliseconds (ms), using the original images. The parameters used for the InceptionV3 model are shown in Table 2.
The focus of this study is to explore the effects of each enhancement on CNN-based pulmonary tuberculosis detection. With the tuned hyperparameters, the original images yield 94.98% training accuracy and 94.14% testing accuracy. The processing time per image is 69.3 ms in training and 74.2 ms in testing.
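The text does not describe how the per-image processing time was measured. One plausible approach, sketched below under that caveat, is to time a whole batch with a wall-clock timer and divide by the number of images. The helper `mean_time_per_item_ms` and the stand-in workload are hypothetical.

```python
import time

def mean_time_per_item_ms(process, items):
    """Return the mean wall-clock time per item, in milliseconds."""
    if not items:
        raise ValueError("items must not be empty")
    start = time.perf_counter()
    for item in items:
        process(item)
    return (time.perf_counter() - start) * 1000.0 / len(items)

# Stand-in workload (hypothetical): summing a list plays the role of one model call.
fake_images = [list(range(1000)) for _ in range(50)]
print(f"{mean_time_per_item_ms(sum, fake_images):.4f} ms per item")
```

Averaging over many images smooths out timer jitter, which matters when differences between methods are only a few milliseconds.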

Contrast Enhancement
In the contrast enhancement scenario, HE and CLAHE are applied to the images, which are then fed into the CNN as input. In Figure 4, it is observed that CLAHE gives more clarity than HE. In CLAHE, the images are enhanced with clip limit 4.0 and tile grid size 16 × 16 as default.
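The sketch below illustrates the two CLAHE parameters mentioned here: the clip limit caps each tile's histogram before equalization, and the tile grid sets how many local regions are equalized. It is a simplified, tile-wise variant written in NumPy. It omits the bilinear interpolation between neighbouring tiles that full CLAHE uses, so it can show block edges, and it assumes the clip limit is a multiple of the mean histogram bin height. It is not the implementation used in the study, and the function names are hypothetical.

```python
import numpy as np

def clipped_equalization_lut(tile, clip_limit):
    """Build a gray-level lookup table from a clipped tile histogram."""
    hist = np.bincount(tile.ravel(), minlength=256).astype(np.float64)
    # Assumption: the clip limit is relative to the mean bin height (min. 1 pixel).
    limit = max(1.0, clip_limit * tile.size / 256.0)
    excess = np.sum(np.maximum(hist - limit, 0.0))
    hist = np.minimum(hist, limit) + excess / 256.0  # spread the excess evenly
    cdf = np.cumsum(hist)
    return np.round(cdf / cdf[-1] * 255.0).astype(np.uint8)

def tilewise_clahe(image, clip_limit=4.0, grid=(16, 16)):
    """Equalize each tile with a clipped histogram (no inter-tile interpolation)."""
    if image.shape[0] < grid[0] or image.shape[1] < grid[1]:
        raise ValueError("image is smaller than the tile grid")
    out = np.empty_like(image)
    row_edges = np.linspace(0, image.shape[0], grid[0] + 1).astype(int)
    col_edges = np.linspace(0, image.shape[1], grid[1] + 1).astype(int)
    for r0, r1 in zip(row_edges[:-1], row_edges[1:]):
        for c0, c1 in zip(col_edges[:-1], col_edges[1:]):
            tile = image[r0:r1, c0:c1]
            out[r0:r1, c0:c1] = clipped_equalization_lut(tile, clip_limit)[tile]
    return out

# Synthetic 300 x 300 low-contrast image standing in for a resized CXR.
rng = np.random.default_rng(0)
cxr_like = rng.integers(90, 150, size=(300, 300), dtype=np.uint8)
enhanced = tilewise_clahe(cxr_like, clip_limit=4.0, grid=(16, 16))
print(enhanced.shape, enhanced.dtype)
```

A lower clip limit flattens the histogram more before equalization, which limits how strongly noise in nearly uniform regions is amplified.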

Sharpness Enhancement
For the sharpness enhancement, this study performs the Fast Fourier Transform (FFT) with three high-pass filters, i.e., Ideal High-Pass Filter (IHPF), Butterworth High-Pass Filter (BHPF), and Gaussian High-Pass Filter (GHPF). These filters enhance the images in the frequency domain, as can be seen in Figure 6. Additionally, the processing time varies for each high-pass filter. FFT-IHPF takes 85.3 ms per image in training and 88.6 ms per image in testing. FFT-GHPF takes 87.3 ms per image in training and 91.4 ms per image in testing. Lastly, FFT-BHPF takes 87.7 ms per image in training and 87.3 ms per image in testing. This increase in time is attributed to noise or artifacts introduced by the FFT enhancement, which potentially make it harder for the CNN to extract features. This result shows that FFT-IHPF-enhanced images perform better in the CNN, with higher testing accuracy and a faster training time per image.
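The three high-pass filters can be sketched in NumPy using their textbook transfer functions. The cutoff radius and Butterworth order are not given in the text, so the values below are placeholders, and the function names are hypothetical. The sketch returns only the high-pass component; how the study combines it with the original image is not stated.

```python
import numpy as np

def highpass_mask(shape, kind, cutoff=30.0, order=2):
    """Frequency-domain high-pass mask with the zero frequency at the centre."""
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    dist = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)  # distance D from the DC term
    if kind == "ideal":
        return (dist > cutoff).astype(np.float64)
    if kind == "gaussian":
        return 1.0 - np.exp(-(dist ** 2) / (2.0 * cutoff ** 2))
    if kind == "butterworth":
        # 1 / (1 + (D0 / D)^(2n)), rearranged to avoid dividing by zero at D = 0.
        d2n = dist ** (2 * order)
        return d2n / (d2n + cutoff ** (2 * order))
    raise ValueError(f"unknown filter kind: {kind}")

def fft_highpass(image, kind, cutoff=30.0, order=2):
    """Apply a high-pass filter in the frequency domain and return the real result."""
    spectrum = np.fft.fftshift(np.fft.fft2(image.astype(np.float64)))
    filtered = spectrum * highpass_mask(image.shape, kind, cutoff, order)
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))

# Synthetic 300 x 300 image standing in for a resized CXR.
rng = np.random.default_rng(0)
cxr_like = rng.random((300, 300)) * 255.0
for kind in ("ideal", "butterworth", "gaussian"):
    edges = fft_highpass(cxr_like, kind)
    print(kind, edges.shape)
```

The ideal filter has a hard cutoff, which tends to cause ringing, while the Butterworth and Gaussian filters roll off smoothly. This is one reason the three filters can produce enhanced images with different characteristics.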

Conclusions
This paper implements three image enhancement methods: HE, CLAHE, and FFT. These enhanced images are used as input to the InceptionV3 architecture with transfer learning using ImageNet. The best performance, in terms of testing accuracy and processing time, is achieved using CLAHE enhancement. This technique increases the testing accuracy by 4.57% in comparison to the original images as input. Similar to accuracy, the processing time is also 5.6 ms faster per testing image. The comparison of the performance of each enhanced image is shown in Table 3.
Regarding the FFT techniques, although they show high accuracy, the processing time increases. The highest performance, in terms of testing accuracy, is shown by FFT-IHPF, with an increase of 4.15% in comparison to the original images as input. Despite this significant increase, the processing time per testing image also increased by 14.4 ms compared to the original testing image processing time.
This study concludes that performing image enhancement can lead to several effects. In this CNN-based pulmonary tuberculosis detection, contrast enhancement is superior in accuracy and processing time, with CLAHE being more suitable. In comparison, although the FFT-based sharpness enhancement achieves high accuracy, the processing time increases. Further study is advisable to examine other CNN architectures and to add classes with a suitable enhancement.
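The reported differences can be checked with simple arithmetic on the figures given in the text. The sketch below assumes the accuracy gains are percentage-point differences, which matches the FFT-IHPF numbers. The CLAHE values it prints are only implied by the stated differences and are not read from Table 3.

```python
# Figures reported in the text for testing accuracy (%) and time per image (ms).
baseline = {"test_acc": 94.14, "test_ms": 74.2}
fft_ihpf = {"test_acc": 98.29, "test_ms": 88.6}

acc_gain = round(fft_ihpf["test_acc"] - baseline["test_acc"], 2)
time_cost = round(fft_ihpf["test_ms"] - baseline["test_ms"], 1)
print(acc_gain, time_cost)  # 4.15 14.4

# Assumption: CLAHE figures implied by the stated +4.57 points and -5.6 ms.
clahe_acc = round(baseline["test_acc"] + 4.57, 2)
clahe_ms = round(baseline["test_ms"] - 5.6, 1)
print(clahe_acc, clahe_ms)  # 98.71 68.6
```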
The Department of Computer Science and Electronics, the Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Indonesia, funded the dissemination and publication of this research.

Fig. 3. Proposed CNN for Tuberculosis Detection

the CNN architecture to learn the features of each image in comparison to HE-enhanced images, as shown in Figure 5.

Fig. 7. (a) FFT-IHPF (b) FFT-GHPF (c) FFT-BHPF

Based on the observation of several samples of these enhanced CXR images in Figure 6, the enhancement results are generally similar across the three high-pass filters. Despite this similarity, the performance of the CNN with these enhanced images as input differs, as shown in Fig. 5. FFT-IHPF achieves 97.24% training accuracy and 98.29% testing accuracy. FFT-GHPF achieves 98.43% training accuracy and 98.14% testing accuracy. Meanwhile, FFT-BHPF achieves 98.71% training accuracy and 97.86% testing accuracy. For FFT-IHPF, the processing time per testing image increases by 14.4 ms compared to the original testing image processing time. A deeper examination of the three FFT techniques shows that the results of CNN training can differ depending on the high-pass filter applied. This indicates that each filter generates enhanced images with different characteristics.

Table 3. Overall Performance