Deep learning for extracting micro-fracture: Pixel-level detection by convolutional neural network

Hydraulic stimulation has been a key technique in enhanced geothermal systems (EGS) and in the recovery of unconventional hydrocarbon resources, where it is used to artificially generate fractures in a rock formation. Previous experimental studies show that the pattern and aperture of the generated fractures vary as the fracturing pressure propagates. The recent development of three-dimensional X-ray computed tomography (CT) allows the fractures to be visualized so that their morphological features can be analysed further. However, a generated fracture spans only a few pixels (e.g., 1-3 pixels), so the accurate and quantitative extraction of micro-fractures is highly challenging. In addition, the high-frequency noise around the fracture and the weak contrast across it limit the applicability of conventional segmentation methods. In this study, we adopted a deep learning encoder-decoder network based on a convolutional neural network (CNN) for the fast and precise detection of micro-fractures. The conventional image processing methods fail to extract continuous fractures and overestimate the fracture thickness and aperture, whereas the CNN-based approach successfully detects barely visible fractures. Reconstructing the 3D fracture surface and quantitatively analysing the roughness of the fracture surfaces extracted by the different methods enables a comparison of their sensitivity (or robustness) to noise.


Introduction
Discontinuities in a rock mass are a major factor governing its mechanical stability and hydraulic properties, and they must be considered in groundwater flow, unconventional energy development, and enhanced geothermal systems. To improve the efficiency of deep geothermal power generation, additional fractures are initiated in the rock mass by hydraulic stimulation. Therefore, recognizing a fracture and its spatial development is vital to understanding the fluid-driven fracturing mechanism in hydraulic stimulation.
High-resolution X-ray computed tomography (CT) uses X-rays to visualize the interior of a specimen and characterize its material components, thereby providing three-dimensional images of materials, including their fractures. X-ray CT images contain unavoidable noise and often have low contrast between the targeted objects and their surroundings, which makes fracture detection difficult with conventional algorithms. Image segmentation algorithms have been proposed for decades, such as those that separate an image into multiple elements and then perform semantic segmentation [1][2][3][4] and hybrid approaches that detect fractures by integrating conventional image processing with machine learning techniques [5][6][7][8]. Despite the various applications of conventional image processing and machine learning algorithms, a robust fracture detection method that is independent of the morphology and illumination conditions in the image is still needed.
Convolutional neural networks (CNNs), a class of deep learning techniques, are attracting attention in the computer vision field for their outstanding performance in object recognition. Recently, many studies have used deep learning to extract fractures quickly and accurately from images of various infrastructures, such as crack identification and classification in concrete structures and crack detection in pavement [9][10][11][12]. CNNs, which are specialized for images, detect objects through robust feature extraction based on the spatial correlation of the elements that constitute the image.
On the other hand, in-situ rock is anisotropic as a material, and disturbance during coring is inevitable; both hinder the evaluation of fracture behavior. In this study, therefore, cement paste and mortar specimens, artificially made to be relatively homogeneous and isotropic, are used. X-ray CT imaging is performed on specimens fractured in laboratory hydraulic fracturing tests, and a CNN-based deep learning network is developed to extract micro-fractures from the 3D X-ray CT images of the specimens. Quantitative analysis of fractures requires pixel-level semantic segmentation, so we adopted a fully convolutional network (FCN), which is a modification of the conventional CNN. The proposed model consists of an encoder whose backbone is ResNet-152 [13], a network that has achieved excellent performance on image classification, and a decoder network that incorporates deconvolution and skip connections. Finally, morphological analysis is conducted on the 3D fracture surface extracted quantitatively from the material by X-ray CT imaging and the CNN-based deep learning model.

Formation of micro-fracture
A laboratory hydraulic fracturing test was adopted to synthetically generate fractures in nominally identical cement paste and mortar specimens, used in place of complex in-situ rock. The specimen was composed of ordinary Portland cement, sand, and water mixed at a ratio of 1:1:0.4, and the mixture was poured into a PVC cylinder with an inner diameter of 56 mm and a length of 100 mm. The cylindrical mold housed a stainless-steel rod with a diameter of 5 mm and a length of 60 mm at its center, so that the cured specimen would have a borehole once the mold was disassembled (Fig. 1).
It is known that fractures generated by hydraulic fracturing tend to show a complex pattern as the viscosity of the fracking fluid decreases. Our previous study also showed that the fracture aperture becomes wider when a more viscous fluid is applied, and that the aperture decreases as the fracture propagates because of energy dissipation [14]. Therefore, we used only one fluid with a relatively high viscosity (e.g., oil) to prepare the training data for developing a deep convolutional neural network capable of automatically detecting micro-fractures.

Data preparation
An industrial X-ray CT inspection system (EYE PCT-G3, SEC Co.) was used to obtain 3D μ-CT images of the fractured specimens. Fig. 1(a) shows the configuration in which X-rays are emitted from the source (left), pass through the specimen (center), and reach the detector (right). The specimen is rotated through 360 degrees and imaged from all directions, and the reconstructed 3D specimen is shown in Fig. 1(b). Fig. 2(a) presents a 2D cross-sectional image of the fractured mortar-based specimen, which includes fractures that are hard to see. Note that the gray-level CT number in Hounsfield units (HU) represents the density of the constituents, so voids and fractures appear dark and the solid mortar appears bright (enlarged on the right of Fig. 2(a)).
A pair of images consisting of an input image and the corresponding ground truth, in which the fractures are manually labelled, is shown in Fig. 1(b). These image pairs are used to train the fracture detection network; each 16-bit gray-level image has 1024 × 1024 pixels with dx = dy = 108 μm.
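To make the image scale concrete, the following minimal sketch converts pixel counts to physical lengths using the pixel size quoted above. The function names and the rescaling of 16-bit gray levels to the range [0, 1] are illustrative assumptions; the text does not describe the preprocessing that was actually applied.

```python
# Values taken from the text: 1024 x 1024 pixels, dx = dy = 108 micrometres, 16-bit gray levels.
PIXEL_SIZE_UM = 108.0
IMAGE_SIZE_PX = 1024
MAX_16BIT = 65535  # largest value representable in an unsigned 16-bit image


def pixels_to_mm(n_pixels, pixel_size_um=PIXEL_SIZE_UM):
    """Convert a length in pixels to millimetres."""
    return n_pixels * pixel_size_um / 1000.0


def normalize_16bit(value):
    """Rescale a 16-bit gray level to [0, 1] (hypothetical preprocessing step)."""
    return value / MAX_16BIT


# Side length of one slice, and the physical size of a feature spanning 1 to 3 pixels.
field_of_view_mm = pixels_to_mm(IMAGE_SIZE_PX)
thin_feature_mm = [pixels_to_mm(n) for n in (1, 2, 3)]
```

Under these values one slice covers roughly 110.6 mm per side, and a feature spanning 1 to 3 pixels corresponds to roughly 0.11 to 0.32 mm.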

CNN-based encoder-decoder network
Computer vision has recently attracted attention owing to the powerful performance of CNNs. In particular, the encoder-decoder structure has been widely adopted in areas such as sequence-to-sequence learning, autonomous driving technology, and semantic segmentation [15][16][17][18].
This study aims to extract the features of the fractures in the CT images, so the encoder network, which performs the feature extraction, is indispensable. To make the most of the acquired data, transfer learning is useful: it reuses not only the network architecture but also weights pretrained on a large benchmark database (e.g., ImageNet) for initialization. In this study, a deep residual network with 152 layers [13], which performed outstandingly on both classification and detection in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), was used to compose the encoder network.
The size of the feature map unavoidably decreases through the encoding process, which leads to a loss of image information. To compensate for this loss, a skip layer is fused with a transposed convolutional layer in the decoder network.
The 3D CT images form a series of image slices, and we selected the training data at intervals to avoid overfitting during training. Generating a careful ground truth with pixel-level accuracy is time-consuming and expensive, so the training data were prepared efficiently using image augmentation.
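The decoder idea described above can be illustrated with a minimal single-channel sketch in plain Python. It upsamples a coarse feature map with a transposed convolution and then fuses it with a higher-resolution skip feature map. Fusion by element-wise addition, the 2 × 2 kernel of ones, the stride of 2, and all names are assumptions made for illustration; the text does not specify how the fusion is performed or which layer parameters were used.

```python
def transposed_conv2d(x, kernel, stride=2):
    """Single-channel transposed convolution without padding, on nested lists."""
    h, w = len(x), len(x[0])
    k = len(kernel)
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    # Each input value scatters a scaled copy of the kernel into the output.
    for i in range(h):
        for j in range(w):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]
    return out


def fuse_skip(upsampled, skip):
    """Fuse decoder and encoder feature maps by element-wise addition (assumed)."""
    if len(upsampled) != len(skip) or len(upsampled[0]) != len(skip[0]):
        raise ValueError("feature maps must have the same spatial size")
    return [[u + s for u, s in zip(row_u, row_s)]
            for row_u, row_s in zip(upsampled, skip)]


# A 2 x 2 coarse map (encoder output) and a 4 x 4 skip map from an earlier layer.
coarse = [[1.0, 2.0], [3.0, 4.0]]
kernel = [[1.0, 1.0], [1.0, 1.0]]  # in a real network these weights are learned
skip = [[0.5] * 4 for _ in range(4)]

upsampled = transposed_conv2d(coarse, kernel, stride=2)  # 4 x 4 map
fused = fuse_skip(upsampled, skip)
```

With this kernel and stride each coarse value is copied into a 2 × 2 block, so the spatial size lost in encoding is restored while the skip map contributes the finer detail.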
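The data preparation described above can be sketched as follows. The sampling step, the particular transforms (flips and a 90-degree rotation), and all function names are assumptions for illustration; the text states only that slices were chosen at intervals and that image augmentation was used. The one essential point is that the same geometric transform is applied to an image and to its ground-truth mask.

```python
def select_slice_indices(n_slices, step):
    """Pick every `step`-th slice so that near-duplicate neighbours are skipped."""
    return list(range(0, n_slices, step))


def identity(img):
    return [row[:] for row in img]


def flip_horizontal(img):
    return [row[::-1] for row in img]


def flip_vertical(img):
    return [row[:] for row in img[::-1]]


def rotate90(img):
    """Rotate a 2D nested list by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment_pair(image, mask):
    """Apply each transform identically to the image and its label mask."""
    transforms = (identity, flip_horizontal, flip_vertical, rotate90)
    return [(t(image), t(mask)) for t in transforms]


# Toy 2 x 2 example: gray levels and a binary fracture mask.
image = [[10, 20], [30, 40]]
mask = [[0, 1], [0, 1]]

indices = select_slice_indices(10, 4)
pairs = augment_pair(image, mask)
```

Each labelled pair yields four training pairs in this sketch, which reduces the number of slices that must be labelled by hand.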

Results and discussion
Performance verification using test samples is conducted to assess the field applicability of the fine-tuned CNN model for automated fracture detection. A total of 700 images, completely separate from the training dataset, are used for testing. To evaluate the fractures extracted by the proposed deep learning model, two image processing methods commonly used for image segmentation are selected for comparison: region growing and locally adaptive thresholding. The conventional methods produce plausible results, but the pre- and post-processing required to achieve them is costly.
We select four types of features that are prominent among the generated fractures so that the qualitative performance of the fracture extraction can be compared more intuitively. The first row in Fig. 4 shows the gray-level CT images used as input. Below the original input images, the manually labelled ground-truth images, the fractures extracted by the two conventional methods, and the fractures predicted by the CNN-based deep learning model are listed in this order. Branched fractures and parallel fractures with different apertures result in poor performance with the conventional methods, as shown in Figs. 4(a) and 4(b). Pre-existing voids and artificially generated fractures cannot be distinguished by pixel intensity alone, and the general image processing methods indeed include voids as part of the fracture (Fig. 4(c)). Fig. 4(d) shows fractures that are barely visible because of their very thin aperture; here the conventional methods tend to overestimate the fractures or even capture ambient image noise. For all the various fracture patterns that can occur, the proposed CNN-based encoder-decoder model produces outstanding results that are difficult to distinguish from the manually labelled ground-truth images.
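For reference, the two conventional baselines named above can be sketched in plain Python as follows. This is a simplified illustration, not the implementation used in this study: the seed position, intensity tolerance, window radius, and offset are hypothetical parameters, and the pre- and post-processing steps mentioned in the text are omitted.

```python
from collections import deque


def region_growing(img, seed, tol):
    """Grow a region from `seed` over 4-connected pixels within `tol` of the seed value."""
    h, w = len(img), len(img[0])
    ref = img[seed[0]][seed[1]]
    mask = [[0] * w for _ in range(h)]
    mask[seed[0]][seed[1]] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and not mask[nr][nc]
                    and abs(img[nr][nc] - ref) <= tol):
                mask[nr][nc] = 1
                queue.append((nr, nc))
    return mask


def adaptive_threshold(img, radius, offset):
    """Mark a pixel as fracture if it is darker than its local mean minus `offset`."""
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            rows = range(max(0, r - radius), min(h, r + radius + 1))
            cols = range(max(0, c - radius), min(w, c + radius + 1))
            window = [img[i][j] for i in rows for j in cols]
            if img[r][c] < sum(window) / len(window) - offset:
                mask[r][c] = 1
    return mask


# Toy 5 x 5 slice: bright matrix (200) crossed by a dark, one-pixel-wide fracture (50).
toy = [[200, 200, 50, 200, 200] for _ in range(5)]

grown = region_growing(toy, seed=(0, 2), tol=20)
thresholded = adaptive_threshold(toy, radius=1, offset=30)
```

Both functions decide pixel by pixel from gray levels alone, which is why they recover the clean line in this toy slice but have no means of separating a dark void from a dark fracture.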

Conclusions
The conventional image processing methods fail to capture disconnected and barely visible fractures because they rely on pixel intensity alone. We used X-ray CT imaging and a CNN-based deep learning network to quantitatively extract fractures from cement paste and mortar specimens fractured in laboratory hydraulic fracturing tests. To overcome the widespread noise and the weak contrast between the targeted fracture and the background in the X-ray CT images, feature extraction and fracture reconstruction are executed sequentially through an encoder-decoder network. The proposed model successfully extracted various types of fractures, such as branched fractures, parallel fractures with different apertures, fractures passing through a void, and hair-like thin fractures. In addition, the voids that the fractures pass through are effectively excluded, thereby enabling quantitative analysis of the morphology and properties of the fracture surface. In this study, we adopted a CNN-based deep learning model to overcome the economic and technical limitations in the field of object recognition and segmentation, and confirmed that it extracts thin fractures with pixel-level accuracy.