Breast Cancer Diagnosis from Histopathology Images Using Deep Learning Methods: A Survey

Abstract. Breast cancer is a major public health issue whose burden can be reduced through early identification and effective therapy. The diagnosis and prognosis of serious illnesses usually rely on a biopsy of the affected organ to identify and classify the malignant cells or tissues. Tissue histopathology is one of the major advancements in modern medicine for the identification of breast cancer. Pathologists use haematoxylin and eosin (H&E) stained slides to distinguish benign from malignant tissue in clinical cases of invasive breast cancer. A digital whole slide image (WSI) is a high-resolution digital file that is stored permanently for flexible use. This article reviews and compares manual and automated approaches to categorising breast cancer cells. Lobular carcinoma in situ and ductal carcinoma in situ are the two main in situ types of breast cancer. Detailed explanations are given of the techniques used on histopathology images for nucleus detection, segmentation, feature extraction, and classification. Nucleus patches are extracted from the pre-processed image using several feature extraction approaches. Thanks to the computational capability of the graphics processing unit (GPU), these algorithms can be implemented efficiently. Deep convolutional neural networks (DCNN), support vector machines (SVM), and other machine learning methods are the most popular and effective algorithms.


Introduction
Breast cancer is a major problem for women's health worldwide. Based on GLOBOCAN 2012 data, the age-standardized incidence rate (ASR) for invasive breast cancer in females was 29.1 per 100,000 women per year in Asia, which is roughly 30% of the rate in the West [1,2]. The ASR per 100,000 women per year is 91.6 in North America and 71.1 in Europe. The WHO reported 2.09 million cases and 627,000 deaths worldwide due to breast cancer alone [2,9]. These figures represent the incidence and mortality rates by race and ethnicity at that time. The incidence rates of both invasive breast cancer and ductal carcinoma in situ (DCIS) increased significantly between the 1980s and 1990s. In the USA alone, 41,760 women were projected to lose their lives to breast cancer in 2019 [1,2,8,9].
Hungarian and Russian pathologists and surgeons created and implemented the Interactive Histopathology Consultation Network (INTERPATH) initiative to conduct site-experimental casting of histological and cytological multimedia materials between hospitals and university institutes of pathology [3,4]. For the diagnosis and prognosis of the majority of serious illnesses, a biopsy of the affected organ is necessary in order to measure and recognize the deformation of cells (cytology) or tissues (histology) in comparison to normal cells [5].
Over the past ten years, huge advances in processing power and image analysis techniques have made possible the creation of potent computer-assisted analytical methods for the interpretation of radiological data [8]. The development of whole slide imaging (WSI) has made it possible to digitize tissue histology slides and save them as digital images [4-9]. In WSI, which is often referred to as virtual microscopy, a full microscope slide is scanned to produce a single high-resolution digital file. Since pathology is one of the most computer-intensive medical screening fields, histopathologists have frequently been at the forefront of computer literacy in diagnostics [6][7][8][9][10]. The FDA authorized digital imaging in 2011, making it the most common, sensitive, and widely used imaging technology to date. Digital whole slide imaging (DWSI) is the most popular and practical method for helping the pathologist preserve slides digitally so that they may be used for image analysis in the future [7][8][9][10][11][12].
The numerous specialized systems now in use for the histopathologic categorization of breast cancer are descriptive and based on the histology, the cytology, or both, of the tumors [28]. Screening is crucial for early identification of breast cancer, since the disease usually has no symptoms while the tumor is small and manageable [1,5]. Four categories of tumor exist: (1) lobular carcinoma, (2) other (rare) tumors, (3) infiltrating carcinoma, and (4) carcinoma with high metastasis. The WHO approach includes a grade-based categorization of tumors that follows the advice offered by Bloom and Richardson. The grade is determined by three components, each scored with 1, 2, or 3 points: the number of hyperchromatic nuclei and mitoses per high-power field (few, moderate, many), the irregularity of the nuclei in terms of size, shape, and staining, and the degree of tubule formation. The sum of these points is graded as I for 3-5 points, II for 6-7 points, and III for 8-9 points [6][7][8][9][10][11][12][13][28].
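The point-to-grade mapping described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `histological_grade` and its parameter names are hypothetical and do not come from the cited grading scheme.

```python
def histological_grade(tubule_score: int, nuclear_score: int, mitosis_score: int) -> str:
    """Map three component scores (each 1, 2 or 3) to grade I, II or III.

    The names are illustrative assumptions; only the thresholds
    (3-5 -> I, 6-7 -> II, 8-9 -> III) are taken from the text.
    """
    for score in (tubule_score, nuclear_score, mitosis_score):
        if score not in (1, 2, 3):
            raise ValueError("each component score must be 1, 2 or 3")
    total = tubule_score + nuclear_score + mitosis_score  # always between 3 and 9
    if total <= 5:
        return "I"
    if total <= 7:
        return "II"
    return "III"


if __name__ == "__main__":
    print(histological_grade(1, 2, 2))
    print(histological_grade(3, 3, 2))
```

Validating the component scores first keeps the total inside the 3-9 range for which the grading is defined.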

Fluorescent microscopic imaging methods
A breakthrough in optical imaging has occurred during the past 20 years, and several imaging techniques are now accessible and in use. Polarization difference imaging gave more precise findings for personal identification [5][6][7][8]. The peak intensities of the two polarization components vary with the tissue type and the illumination wavelength [2].
The Fourier transform infrared (FT-IR) imaging technique performs molecular-level spectroscopic investigation. Optical imaging plays a significant role in the study and diagnosis of breast cancer [4].
With the introduction of multispectral imaging, which marked a turning point in the diagnosis of cancer and other illnesses, pathologists moved further away from the subjective assessment of morphological patterns [5].
Near infrared (NIR) autofluorescence images have been used to identify and image cancer in real time, whereas cancer diagnosis is usually made through a screening procedure in which biopsy tissue is examined under a microscope by a skilled pathologist [6,7]. Using a straightforward Bayesian classifier based on 8 extracted features and complete leave-one-out cross-validation, this strategy achieves a high classification accuracy (98%) on 64 images (n=64) [7].
According to the 2019 study, Table 1 displays the age-wise percentage of female patients from India and the USA who have breast cancer. Table 1 shows that women between the ages of 40 and 60 are the most affected by breast cancer.

Breast Cancer diagnosis methods
H&E staining is the pathological procedure used to emphasize the tissue's structure. To improve the clarity of the image, the tissue is embedded in paraffin prior to H&E staining [4,5]. Over the past decade, DWSI has made it easier for pathologists to preserve tissue images and use them to identify structural deformation of tissue [4,6,7,8]. Immunohistochemistry (IHC) is an enhanced staining method that uses antibodies to highlight specific antigens present in the tissue, as seen in Fig. 2(b) and (c) [19]. In breast cancer, IHC is frequently used to highlight the presence of the estrogen and progesterone hormone receptors and of human epidermal growth factor receptor 2 (HER2), as well as to assess the tumor's growth, for instance by highlighting the protein Ki-67, which is linked to cell growth [18].
The identification of cell nuclei is a crucial component of many laboratory tests used in medical diagnosis [5,7,10,20-35]. The precise localization of cell nuclei benefits both correct diagnosis and automated microscopy applications such as cell counting and tissue architecture analysis [12]. Histopathology and image processing are the two main subfields of biomedical imaging [21]. The constituent parts of the overall histopathological imaging framework are depicted in Fig. 3. Using the most common approach, nuclear segmentation, the findings were carefully evaluated across 25 typical images (15 in vitro and 10 in vivo) encompassing more than 7400 nuclei [12][13][14]. The overall accuracy of the suggested segmentation algorithm is close to 86%. When only over- and under-segmentation faults were taken into account, the accuracy was found to be more than 94% [10]. For a stained image I [18], the H&E staining may be statistically modelled as in Eqn. (1).

Preprocessing steps of an image

Thresholding
This technique converts a grayscale image I into a binary image $I_b$ by giving each pixel a value of one or zero depending on whether its intensity is above or below a predetermined global threshold T, as in Eqn. (3) [18].
$$ I_b(x, y) = \begin{cases} 1, & I(x, y) > T \\ 0, & \text{otherwise} \end{cases} \tag{3} $$
Computational techniques such as Otsu's method, which minimizes the intra-class variance, may be used to determine the optimum threshold value T.
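As a minimal sketch of global thresholding with an automatically chosen T, the following pure-Python code implements Otsu's criterion by maximizing the between-class variance, which is equivalent to minimizing the intra-class variance. It assumes 8-bit integer intensities; the function names `otsu_threshold` and `binarize` are illustrative, and a production pipeline would normally rely on an image-processing library instead.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold T that maximizes the between-class variance.

    `pixels` is a flat list of integer intensities in [0, levels - 1].
    Class 0 holds intensities <= T, class 1 holds intensities > T.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # intensity sum of class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # class 0 still empty
        if w1 == 0:
            break     # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(img, t):
    """Apply the global threshold: 1 where intensity > t, else 0."""
    return [[1 if p > t else 0 for p in row] for row in img]


if __name__ == "__main__":
    image = [[10, 12, 11], [200, 205, 198]]  # hypothetical toy image
    T = otsu_threshold([p for row in image for p in row])
    print(T, binarize(image, T))
```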

Morphology
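Based on a set-theoretic approach, morphology treats an image as a set of pixels. The foreground pixels and the structuring element are denoted by $P_f$ and S, respectively [20,25,27].
$$ P_f \ominus S = \{ z \mid S_z \subseteq P_f \}, \qquad P_f \oplus S = \{ z \mid \hat{S}_z \cap P_f \neq \emptyset \} \tag{4} $$
where $\ominus$ and $\oplus$ are the erosion and dilation (morphological operators) shown in Eqn. (4), $S_z$ is S translated by z, and $\hat{S}$ is the reflection of S. The opening of an image I by S is an erosion followed by a dilation, $I \circ S = (I \ominus S) \oplus S$.
The white top-hat transform is the difference between image I and its opening, as in Eqn. (5).
$$ T_w(I) = I - (I \circ S) \tag{5} $$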
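The operators above can be illustrated with a short sketch. It assumes a grayscale formulation with a flat square structuring element, in which erosion is a local minimum and dilation a local maximum, and it ignores pixels that fall outside the image; these choices and all function names are assumptions made for illustration, not the implementation used in the cited works.

```python
def _window_filter(img, size, reduce_fn):
    """Apply reduce_fn (min or max) over a size x size window around each pixel.

    Pixels outside the image are ignored (an assumed border rule).
    """
    h, w = len(img), len(img[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def erode(img, size=3):
    """Erosion with a flat square structuring element: local minimum."""
    return _window_filter(img, size, min)


def dilate(img, size=3):
    """Dilation with a flat square structuring element: local maximum."""
    return _window_filter(img, size, max)


def opening(img, size=3):
    """Opening: erosion followed by dilation."""
    return dilate(erode(img, size), size)


def white_top_hat(img, size=3):
    """White top-hat: the image minus its opening."""
    opened = opening(img, size)
    return [[p - o for p, o in zip(row, orow)] for row, orow in zip(img, opened)]
```

The white top-hat keeps structures that are smaller than the structuring element, such as an isolated bright pixel, and suppresses larger ones.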
The black top-hat transform is the difference between the closing of image I by the structuring element S, written $I \bullet S$ (a dilation followed by an erosion), and the image itself, as in Eqn. (6).
$$ T_b(I) = (I \bullet S) - I \tag{6} $$

Level sets and Active Contour Models
Active contour models (ACMs), often referred to as deformable models, are frequently employed in image segmentation. Deformable curves are used to determine the contour of objects in an image by using gradient information to minimize an energy function [20]. The generic ACM is defined by the energy function E over the contour points c, as presented in Eqn. (7).
$$ E = \int_c \left[ E_{int}(c) + E_{img}(c) + E_{ext}(c) \right] dc \tag{7} $$
In Eqn. (7), $E_{int}$ stands for the internal energy that controls the shape and length of the contour, $E_{img}$ for the image effects that adjust the local portions of the contour, and $E_{ext}$ for the prior knowledge about the object that constrains the contour [38][39][40].

K-means Clustering
This technique segments an image into K clusters iteratively. K-means clustering, a vector quantization technique with its roots in signal processing, divides a set of n observations into K clusters by assigning each observation to the cluster with the nearest mean, which serves as the prototype of that cluster.
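The iteration can be sketched for pixel intensities as follows. This is a minimal one-dimensional version with a deterministic, evenly spaced initialization, which is an assumption made here for reproducibility; practical implementations usually cluster colour or feature vectors and use randomized initialization. The name `kmeans_1d` is illustrative.

```python
def kmeans_1d(values, k, max_iterations=100):
    """Cluster scalar values into k groups; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumed initialization: k centers evenly spaced over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each value joins the cluster with the nearest mean.
        new_labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2)
                      for v in values]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
        if new_labels == labels:
            break  # assignments unchanged, so the algorithm has converged
        labels = new_labels
    return centers, labels


if __name__ == "__main__":
    intensities = [10, 12, 11, 200, 205, 198]  # hypothetical pixel values
    print(kmeans_1d(intensities, k=2))
```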

Graph Cuts
Graph cuts (G-cuts) are a broad family of algorithms in which an image is modelled as a weighted undirected graph $G = (V, E)$, where the nodes V represent pixels and the weighted edges E encode the similarity (affinity) measured between nodes [20], i.e. $w : V^2 \to \mathbb{R}^+$.
Fig. 4. Graphical illustration of the graph cut technique [33].
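To make the graph model concrete, the sketch below builds such a graph for a small grayscale image and evaluates the cost of a given partition. The 4-connected neighbourhood, the Gaussian affinity with parameter `sigma`, and the function names are assumptions chosen for illustration; finding the minimum cut itself requires a max-flow solver, which is not shown.

```python
import math


def build_graph(img, sigma=10.0):
    """Return {(u, v): weight} for 4-connected neighbours; nodes are (row, col).

    The weight is an assumed Gaussian affinity exp(-(I_u - I_v)^2 / (2 sigma^2)),
    so similar pixels are joined by heavy edges.
    """
    h, w = len(img), len(img[0])
    edges = {}
    for y in range(h):
        for x in range(w):
            # Right and down neighbours only, so each undirected edge is stored once.
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w:
                    diff = img[y][x] - img[ny][nx]
                    edges[((y, x), (ny, nx))] = math.exp(-diff * diff / (2 * sigma ** 2))
    return edges


def cut_cost(edges, labels):
    """Sum of the weights of edges whose endpoints carry different labels."""
    return sum(w for (u, v), w in edges.items() if labels[u] != labels[v])


if __name__ == "__main__":
    image = [[10, 10, 200, 200], [10, 10, 200, 200]]  # hypothetical toy image
    graph = build_graph(image)
    by_column = {(y, x): int(x >= 2) for y in range(2) for x in range(4)}
    by_row = {(y, x): y for y in range(2) for x in range(4)}
    print(cut_cost(graph, by_column), cut_cost(graph, by_row))
```

A partition that follows the intensity boundary severs only low-affinity edges, so its cut cost is far smaller than that of a partition through a homogeneous region.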

Nuclei Segmentation
Because of its crucial role in the automatic interpretation of stained tissue sections, the segmentation of nuclei has been addressed by numerous authors using various conventional approaches, including mathematical morphology, pixel classification, level sets, and graph-based segmentation methods [19,20,24,36]. Nucleus detection is often carried out by a spatially constrained convolutional neural network (SC-CNN), and a neighboring ensemble predictor (NEP) coupled with a CNN has been suggested to predict the class label of cell nuclei more reliably [25].

Detection of features
Feature detection and extraction can be illustrated with the Oriented FAST and Rotated BRIEF (ORB) method. ORB effectively locates the corners of an image; its FAST component identifies features as regions of the image with a large difference in brightness. The most often used feature detection techniques are contouring, the local binary pattern (LBP), and the localized active contour model (LACM) [26][27][28][29][30][31][32][33][34].
Nuclei segmentation is crucial for the accuracy of classification and grading. Segmentation and nuclei detection are important steps in the processing of histopathologic digital images [41][42]. The nuclei themselves are crucial for assessing and verifying the presence of illness and the speed at which it spreads. A few of the feature detection techniques that have been employed successfully and efficiently are discussed in this study [33][34][35][36][37][38][39][40][41][42][43].
The centroid transform identifies the seed points in the nucleus, but it works only with binarized images and therefore cannot take advantage of additional cues [20]. Shape-based detection uses calculations of side length and area to identify nuclei. The disorder D of a distribution r, which is referred to as shape change, is defined mathematically in Eqn. (8), where $\sigma_r$ and $\mu_r$ are the standard deviation and mean of r, respectively [15].
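One definition of disorder that is common in the histopathology feature literature is $D = 1 - 1/(1 + \sigma_r / \mu_r)$. Whether this matches Eqn. (8) exactly is an assumption, as is the use of the population standard deviation below; the sketch is meant only to show how such a measure behaves.

```python
from statistics import mean, pstdev


def disorder(r):
    """Disorder of a distribution r, assumed to be D = 1 - 1 / (1 + sigma_r / mu_r).

    sigma_r is taken as the population standard deviation (an assumption).
    D is 0 for a perfectly uniform distribution and grows towards 1 as the
    spread increases relative to the mean.
    """
    mu = mean(r)
    sigma = pstdev(r)
    return 1.0 - 1.0 / (1.0 + sigma / mu)


if __name__ == "__main__":
    print(disorder([5, 5, 5]))       # identical measurements, no disorder
    print(disorder([20, 35, 50, 95]))  # hypothetical nuclear areas
```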
Identifying circular nuclei by shape with the Hough transform is computationally expensive [20]. Accuracy can be increased by using a different technique to extract features from the critical areas. Structural, textural, and combined or hybrid techniques are intended to increase accuracy [17].

Classification methods from images
The database makes it simple for the pathologist to make diagnostic and treatment decisions. The classification of histology images is now in common use. Because of their adaptability and successful outcomes, the majority of classifiers rely on support vector machine (SVM) and neural network (NN) classification techniques.

Support vector machine (SVM)
Laplace edge-based SVM classification has been compared with current applications. Figure 3 depicts the SVM classes' integration into features. In addition to the one type of nucleus image used to train it, the classifier can correctly detect two other kinds of cell nuclei with varied staining and scales [12]. The decision function is given in Eqn. (9).
$$ f(x) = \sum_{i=1}^{N} \alpha_i y_i K(s_i, x) + b \tag{9} $$
where x is the data point that has to be classified, $s_i$ is a support vector, N is the total number of support vectors, b is a constant determined during training, and $y_i$ is the class label of the support vector $s_i$. The coefficients $\alpha_i$ are the Lagrangian multipliers. K is the kernel, which determines how the classifier separates the nuclei; in the case of breast cancer, malignant tissue is sought by examining the nuclei. The various kernels are distinguished by the specified kernel function and its properties. The linear kernel is among the most used kernel functions. A kernel-based SVM classifier must be used with care, since the appropriate kernel function depends on the kind of application [8,20-25].
Fig. 3. Support vector machine (SVM) classifier with hyperplane representation [12].
E3S Web of Conferences 430, 01195 (2023), ICMPC 2023, https://doi.org/10.1051/e3sconf/202343001195
A supervised model called the soft-max classifier (SMC) generalizes logistic regression as a function of selected samples.
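The kernel decision function of the SVM, a weighted sum of kernel evaluations between the data point and the support vectors plus a constant, can be sketched as follows. The support vectors, multipliers, and bias are hypothetical toy values rather than a model trained on nuclei, and the RBF kernel with its `gamma` parameter is included only as a second, assumed example of a kernel function.

```python
import math


def linear_kernel(a, b):
    """K(a, b) = a . b"""
    return sum(ai * bi for ai, bi in zip(a, b))


def rbf_kernel(a, b, gamma=0.5):
    """K(a, b) = exp(-gamma * ||a - b||^2); gamma is an assumed default."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def svm_decision(x, support_vectors, alphas, labels, b, kernel=linear_kernel):
    """f(x) = sum_i alpha_i * y_i * K(s_i, x) + b"""
    return sum(a * y * kernel(s, x)
               for s, a, y in zip(support_vectors, alphas, labels)) + b


def svm_predict(x, support_vectors, alphas, labels, b, kernel=linear_kernel):
    """Return the class label +1 or -1 from the sign of f(x)."""
    f = svm_decision(x, support_vectors, alphas, labels, b, kernel)
    return 1 if f >= 0 else -1


if __name__ == "__main__":
    # Hypothetical toy model: two support vectors with opposite labels.
    S = [(1.0, 1.0), (-1.0, -1.0)]
    alpha = [0.25, 0.25]
    y = [1, -1]
    bias = 0.0
    print(svm_predict((2.0, 0.0), S, alpha, y, bias))
    print(svm_predict((-3.0, 1.0), S, alpha, y, bias, kernel=rbf_kernel))
```

Changing only the `kernel` argument changes the shape of the decision boundary, which is why the kernel has to be matched to the application.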

Convolutional Neural Network (CNN)
A convolutional neural network (CNN) maps an input vector x to an output vector y through a series of functions, or layers:
$$ y = f(x; w_1, \ldots, w_L) = f_L(\,\cdot\,; w_L) \circ f_{L-1}(\,\cdot\,; w_{L-1}) \circ \cdots \circ f_2(\,\cdot\,; w_2) \circ f_1(x; w_1) $$
where $w_l$ denotes the parameters of layer l. The convolutional layer slides a kernel over the image, multiplies the pixels of each window one by one by the kernel coefficients, and sums the products over the window [35,36,42]. A DCNN repeatedly applies n x n unpadded convolutions with the specified kernel, each convolution being followed by a rectified linear unit (ReLU), and a max pooling operation with a specific downsampling stride. The size of the kernel and its coefficients should be chosen depending on the application; for convolution, 3x3 or 5x5 kernels are most often employed. The number of feature channels is doubled at each downsampling step. In the expanding path, a 2x2 convolution (up-convolution) halves the number of feature channels at each stage. To make the network deeper, as illustrated in Fig. 4, the whole unit (convolution + ReLU + max pooling) is repeated; in the expanding path, each up-convolved feature map is concatenated with the correspondingly cropped feature map from the contracting path and followed by the necessary number of convolutions with a certain kernel, each followed by a ReLU. Cropping is necessary because boundary pixels are lost in each convolution [26][27][28][29][30][31][32][33][34][35].
ReLU is a thresholding operation that sets every negative value to 0 and leaves positive values unchanged, and pooling either selects the highest value in a window (max pooling) or takes the average of the window's values (average pooling). The window's dimensions should be chosen based on the precision required. Flattening turns the pooled matrix into a column vector, and all elements of this vector are fed as input to the fully connected neural network [36][37][38][39][40]. The classified image is then obtained using the soft-max function. Other neural networks are also in use, such as ResNet and U-Net, the latter being used specifically for segmenting biomedical images [30][31][32][33][34][35][36][37].
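The sequence of unpadded convolution, ReLU, max pooling, and flattening can be sketched on a single-channel image as follows. This is a didactic pure-Python version with illustrative function names; like most deep learning frameworks it computes a cross-correlation, meaning the kernel is not flipped, and real networks operate on multi-channel tensors with learned kernels.

```python
def conv2d_valid(img, kernel):
    """Unpadded ('valid') convolution: window-wise multiply and sum.

    The output shrinks by (kernel size - 1) in each dimension, which is
    why boundary pixels are lost.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[y + j][x + i] * kernel[j][i]
                 for j in range(kh) for i in range(kw))
             for x in range(ow)]
            for y in range(oh)]


def relu(fmap):
    """Set negative values to 0 and keep positive values unchanged."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    oh = (len(fmap) - size) // stride + 1
    ow = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[y * stride + j][x * stride + i]
                 for j in range(size) for i in range(size))
             for x in range(ow)]
            for y in range(oh)]


def flatten(fmap):
    """Turn a 2-D feature map into a 1-D vector for the fully connected layers."""
    return [v for row in fmap for v in row]


if __name__ == "__main__":
    image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]  # hypothetical vertical edge
    edge_kernel = [[-1, 0, 1]] * 3                   # hypothetical 3x3 kernel
    features = flatten(max_pool(relu(conv2d_valid(image, edge_kernel))))
    print(features)
```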

Training of deep learning network
The network is trained with stochastic gradient descent, or any modified gradient descent approach, using the input images and their corresponding segmentation maps. The soft-max approximation function, which depends on the activations, is given in Eqn. (11) [22][23][24][25][26][27][28][29][30].
$$ p_k(x) = \frac{\exp(a_k(x))}{\sum_{k'=1}^{N} \exp(a_{k'}(x))} \tag{11} $$
where $x \in \Omega$, $\Omega$ is the pixel matrix of an image, and $a_k(x)$ is the activation in the kth feature channel at pixel position x. The approximate maximum function $p_k(x)$ satisfies $p_k(x) \approx 1$ for the class k with the highest activation and $p_k(x) \approx 0$ for all other classes. N is the number of classes. The cross entropy in Eqn. (12) then penalises, at each position, the deviation of $p_{\ell(x)}(x)$ from 1, where $\ell(x)$ is the true label of pixel x [28].
$$ E = -\sum_{x \in \Omega} \log\left(p_{\ell(x)}(x)\right) \tag{12} $$
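The pixel-wise soft-max and the cross-entropy penalty can be sketched as follows. The code assumes that each pixel is described by a list of N class activations and an integer true label, uses an unweighted sum over pixels, and subtracts the maximum activation before exponentiating for numerical stability; the function names are illustrative.

```python
import math


def softmax(activations):
    """p_k = exp(a_k) / sum_k' exp(a_k') for one pixel's activations."""
    m = max(activations)  # subtracting the maximum avoids overflow
    exps = [math.exp(a - m) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(activation_maps, true_labels):
    """Sum over pixels of -log p_l(x), where l is the pixel's true class.

    activation_maps: one list of N activations per pixel.
    true_labels: one class index per pixel.
    """
    return -sum(math.log(softmax(a)[label])
                for a, label in zip(activation_maps, true_labels))


if __name__ == "__main__":
    # Hypothetical activations for two pixels and two classes.
    activations = [[4.0, 0.5], [0.2, 3.1]]
    print(cross_entropy(activations, [0, 1]))  # confident and correct
    print(cross_entropy(activations, [1, 0]))  # confident but wrong
```

The loss is small when the probability assigned to the true class is close to 1 and grows quickly as that probability moves away from 1.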
Typically, 80% of the images are used for training the network, while the remaining 20% are used for testing. Specificity defines the insensitivity, that is, the proportion of negative (false) categories that are correctly identified.
Accuracy is the proportion of correctly classified samples among all samples.
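These performance measures can be computed from the four entries of a binary confusion matrix, as the sketch below shows. It uses the standard definitions of precision, recall (sensitivity), specificity, and accuracy; the function name and the example counts are hypothetical and are not results from any study cited here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from a binary confusion matrix.

    tp: malignant samples classified as malignant (true positives)
    fp: benign samples classified as malignant (false positives)
    tn: benign samples classified as benign (true negatives)
    fn: malignant samples classified as benign (false negatives)
    """
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # also called sensitivity
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for a test set of 100 images.
    for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```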

Conclusion
In this study, techniques for processing histopathological images are covered. Additionally, a comparison of image classification and image detection techniques is given in Table 2, together with information on each method's performance. On the basis of this study and comparison analysis, researchers can more quickly and accurately create early detection techniques that may reduce the risk of breast cancer via screening.

Future directions
Deep learning is a very popular and effective approach for the detection and diagnosis of diseases. The combination of deep neural networks with popular machine learning (ML) techniques is attracting the attention of researchers who design systems with improved performance.
In future, novel deep learning models can be designed for image segmentation by incorporating approaches related to feature fusion and to edge-aware, feature-aware, and region-aware techniques. First of all, every learner must study the basic concepts of deep learning and machine learning and understand the roles of the different layers. Generative adversarial networks (GAN), autoencoders (AE), and segmentation networks are the popular deep learning models.
In the stain model of Eqn. (1), one term denotes the pixel linked with the color space, S is the stain matrix, and the new color space is obtained from the RGB color space. The deconvolution D and the optical density φ are then represented as in Eqn. (2).

Fig. 3. Breast cancer diagnosis generalized steps and framework for computer aided diagnosis (CAD) of cancer.

Fig. 4. Deep CNN network for image samples classification, having three main subparts: (a) input image samples, (b) feature extraction network, and (c) fully connected neural network based classification [29].

Table 1. India and USA breast cancer age-wise patients' distribution with individual percentage and number of patients.

Table 2. Comparison of classification accuracy, feature extraction and detection methods.
The techniques used are compared on the basis of the following performance parameters. Precision defines the closeness of classification. Recall, also called sensitivity, shows the proportion of positive categories that are correctly identified.