Spatiotemporal deep learning approach for estimating water content profiles in soil layers

Abstract. Land subsidence associated with the use of natural groundwater resources to serve the needs of a growing population has received extensive research attention over the past few decades. Fluctuation of the water content in subsurface soil layers significantly affects land subsidence. The key objective of this study is to predict changes in water content profiles in soil layers over a long period of time using a deep learning-based approach. A convolutional neural network (CNN) algorithm commonly used in artificial intelligence (AI) applications is modified in the present study to process in-situ measurements of water content profiles. The proposed AI method has the distinct advantage of generating dynamic predictions based on the spatiotemporal characteristics extracted from the data. In addition, three other algorithms are compared with it for time series prediction: long short-term memory (LSTM), multilayer perceptron (MLP) networks, and autoregressive integrated moving average (ARIMA).


Introduction
Plant growth and the agricultural productivity required for the survival of humans and animals depend on soil water content. In addition, soil water content is a key parameter in the rational understanding of the soil-plant-atmosphere water cycle [1][2]. In many regions of the world, groundwater levels have been continuously declining because this valuable resource is used to support human needs. As a result, soil water content decreases significantly, which reduces water storage capacity and leads to significant soil or land subsidence.
Empirical formulas and linear regression equations are widely used to predict land subsidence based on soil water content measurements. Recently, several researchers have investigated multivariate linear relationships between soil water content and rainfall, temperature, and saturation difference for estimating land subsidence. For example, Chen et al. [3] developed relationships for predicting soil moisture, precipitation, and drought periods. Along similar lines, Su-fang et al. [4] also developed a linear regression model. However, estimates of nonlinear variations based on linear regression models are typically associated with large errors and are found to be unsatisfactory because of their low accuracy. More importantly, such methods are not capable of processing extensive data to provide reasonable and generalized relationships that can be used for both small- and large-scale problems.
* Corresponding author: sfazelmojtah@student.unimelb.edu.au
In recent decades, the use of artificial intelligence (AI) models has proliferated in all fields, including geotechnical engineering, for addressing challenging problems because of rapid advancements in computation technology [5][6][7]. Deep learning (DL), a subset of AI, is receiving more research attention than conventional AI models because it is capable of learning much more complex features. Several successful applications of DL have been reported in the literature, especially during the last decade [8][9][10]. The convolutional neural network (CNN) is one of the most popular DL algorithms and is receiving wide attention because it can recognize images and rapidly handle extensive data. Based on images or vibration responses acquired on site, the CNN approach has been used in geotechnical engineering applications such as health monitoring, crack detection [11], damage detection [12], and landslide mapping [13].
The purpose of this paper is to investigate the water content profile using a DL approach. By adjusting the CNN algorithm, time series and spatial location data can be processed to provide early warning of risks based on underlying features of the data. In this study, long short-term memory (LSTM), multilayer perceptron (MLP) networks, autoregressive integrated moving average (ARIMA), and CNN algorithms are compared to examine model performance in both the time and space dimensions.

Data pre-processing
Different types of sensors are typically embedded in boreholes to gather data related to the measurement of water content. Using the time and space dimension information, a time-space matrix can be constructed from the data measured by each datalogger and sensor package, as illustrated in Figure 1. The time dimension uses a measurement frequency of 1 day, and n represents the number of monitoring periods. Narrow time intervals are more effective for capturing changes with higher sensitivity. However, they generate extensive data that must be analysed; in addition, the field measurements can be costly depending on the type of sensors used. The space dimension is typically defined by the number of reading points along the borehole depth (five in Figure 1).
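The construction of the time-space matrix can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_time_space_matrix`, the record layout, and the choice of rows for depth and columns for time are assumptions, and the readings are hypothetical values.

```python
import numpy as np

def build_time_space_matrix(readings, n_depths, n_days):
    """Arrange (depth_index, day_index, value) records into an
    (n_depths, n_days) matrix; cells without a reading stay NaN."""
    matrix = np.full((n_depths, n_days), np.nan)
    for depth_idx, day_idx, value in readings:
        matrix[depth_idx, day_idx] = value
    return matrix

# Hypothetical readings: (sensor index along depth, day index, water content)
records = [(0, 0, 0.21), (1, 0, 0.24), (0, 1, 0.22), (1, 1, 0.25)]
grid = build_time_space_matrix(records, n_depths=2, n_days=3)
print(grid.shape)  # (2, 3); the third day has no readings and stays NaN
```

Leaving missing cells as NaN keeps gaps in the monitoring record visible so that they can be filled in a later step.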

Structure of the CNN
The spatiotemporal characteristics of the monitored water content data are used in this study for water content prediction. As discussed in the Introduction, AI models based on CNN structures can handle spatiotemporal features in images, texts, and grid data (such as Figure 1). The CNN has demonstrated outstanding data learning capabilities due to its unique architecture. As summarized in Figure 2, the CNN architecture used in this study is a combination of convolutional and pooling layers, as well as several fully connected layers.

Inputs and outputs
Long-term prediction has the advantage of leaving enough time to take precautionary actions; however, accuracy is likely to decrease as the number of prediction steps increases. Higher accuracy can be achieved with more extensive data, but at the cost of longer training time. A further challenge is that field conditions may interrupt data continuity. Therefore, the key objective in this study is to predict the water content for the next 7 days based on the prior 7 days of in-situ data and their spatiotemporal characteristics.
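The 7-day-in, 7-day-out setup can be illustrated with a sliding window over the time-space matrix. This is a sketch under stated assumptions: the function name `make_windows`, the one-day stride, and the demo matrix are hypothetical and not taken from the study.

```python
import numpy as np

def make_windows(matrix, n_in=7, n_out=7):
    """Slice an (n_depths, n_days) matrix into input/target pairs.

    Each input holds n_in consecutive days for all depths, and each
    target holds the n_out days that immediately follow it.
    """
    n_depths, n_days = matrix.shape
    n_samples = n_days - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("not enough days for one input/target pair")
    inputs = np.stack([matrix[:, s:s + n_in] for s in range(n_samples)])
    targets = np.stack(
        [matrix[:, s + n_in:s + n_in + n_out] for s in range(n_samples)]
    )
    return inputs, targets

# Hypothetical record: 5 depths monitored for 30 days
demo = np.arange(5 * 30, dtype=float).reshape(5, 30)
X, Y = make_windows(demo)
print(X.shape, Y.shape)  # (17, 5, 7) (17, 5, 7)
```

Each sample keeps all depths together, so the spatial relation between readings is preserved in both the input and the target.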

Division of data
In many scenarios, AI prediction models perform well on training sets but poorly on test sets. This phenomenon is called overfitting, and it results in poor generalization of the model. Although there are many techniques for controlling overfitting, none can avoid it completely. As one such technique, K-fold cross-validation (K-CV) divides all the original data into K subsets; K - 1 subsets serve as training sets, and the remaining subset serves as the testing set [14]. The process is repeated K times so that every sample is used for both training and testing. This reduces the randomness of the database division and preserves as much as possible of the original database's distribution characteristics. Combining K-CV with the prediction model improves its generalization ability.
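The K-CV division described above can be sketched as follows. The function name `k_fold_indices`, the shuffling step, and the values of K and the sample count are assumptions made for illustration; the source does not state whether folds are shuffled or contiguous.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold is the test set once."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)   # shuffling is an assumption
    folds = np.array_split(order, k)     # folds may differ in size by one
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Hypothetical setting: 17 samples divided into K = 5 subsets
splits = list(k_fold_indices(n_samples=17, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Over the K repetitions every sample appears in exactly one testing set and in K - 1 training sets.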

Indicators of performance evaluation
Model accuracy is generally assessed by performance evaluation. In this study, the root-mean-square error (RMSE) and the mean absolute percentage error (MAPE) were selected to demonstrate the validity of the predictions in comparison with the measurements. Eqs. 1 and 2 give the mathematical relationships of RMSE and MAPE, respectively:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{1}$$
$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{\hat{y}_i}\right| \tag{2}$$
where $y_i$ is the predicted water content, $\hat{y}_i$ is the observed water content, and N represents the total number of data points.
Figure 3 illustrates the three steps of deep neural network prediction: data preparation, training, and testing of the model. First, the water content data are collected and fitted into the time-space matrix. Some data may be missing because of interferences in monitoring. The average of adjacent readings can be used to fill such gaps, as can other imputation methods. The training and testing subsets are then divided using K-CV. To complete the training process, labels are added to the samples. In image recognition, a CNN extracts shallow edge-based features and deeper shape features of objects according to its training objective. In water content prediction, the CNN extracts relations in time and space as features. The predicted results for the input dataset can then be compared with the measurements to calculate the accuracy of the trained CNN. In this study, Python is used to run the prediction model.
Five Decagon EC-TM sensors, which can simultaneously measure volumetric water content [16], are inserted into the borehole (Figure 1). The drilling process was also accompanied by field sampling at 5-m intervals. To characterize the soil at different depths, various tests were conducted (e.g., index properties, hydraulic conductivity, contact filter paper method, hanging column tests). Table 1 summarizes the results of the index properties tests.
Figure 6 shows the volumetric water content profiles at various depths, highlighting representative values. Over time, the volumetric water content is typically higher in deeper layers, suggesting more water availability than in shallower layers.
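The two performance indicators used in this study, RMSE and MAPE, can be computed with a few lines of NumPy. This is a sketch using their standard definitions; the function names and the sample values are hypothetical, and the MAPE assumes that no observed value is zero.

```python
import numpy as np

def rmse(predicted, observed):
    """Root-mean-square error between predictions and observations."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

def mape(predicted, observed):
    """Mean absolute percentage error in percent, relative to the
    observations (assumes no observed value is zero)."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs((predicted - observed) / observed)) * 100.0)

# Hypothetical volumetric water contents, for illustration only
obs = [0.20, 0.25, 0.30]
pred = [0.22, 0.24, 0.30]
print(rmse(pred, obs), mape(pred, obs))
```

RMSE is expressed in the units of the water content itself, whereas MAPE is a relative measure, so the two indicators complement each other.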

Parameters for CNN
A CNN implementation should consider the sizes of the convolution and pooling filters and the number of training epochs as hyperparameters. A wide variety of filter shapes is used in the literature. A smaller kernel size gives fewer parameters in the network and therefore lower computation costs. On the other hand, a kernel with a large 'local receptive field' can capture the training data better.
Choosing larger shapes for pooling significantly decreases the dimension of the data, but it is likely to result in loss of information. In order to balance efficiency and effectiveness, the convolution filter kernel size is set to (5 × 5). Based on the data size, we choose (2 × 2) as a typical maximum pooling shape. Three dimensions define the input data for the model: the first represents the channels, the second represents the readings, and the third represents the time period. The input matrix can be reshaped while keeping the same size as the original. In layer 6, the output matrix is flattened into a vector, which results in a more efficient CNN model. A fully connected layer then transforms the vector into the model output.
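The sequence of convolution, pooling, flattening, and a fully connected layer can be traced with a small forward pass in NumPy. This is only a sketch of the general technique, not the network in Figure 2: it reads the kernel and pooling shapes as 5 × 5 and 2 × 2, and the zero padding, the ReLU activation, the single channel, and the random untrained weights are all assumptions.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Cross-correlate a 2-D array with an odd-sized kernel, using zero
    padding so that the output has the same shape as the input."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; rows and columns that do not fill a
    complete window are dropped."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
sample = rng.random((5, 7))                 # 5 depths x 7 days, one channel
kernel = rng.standard_normal((5, 5))        # hypothetical untrained filter
feature_map = np.maximum(conv2d_same(sample, kernel), 0.0)  # ReLU (assumed)
pooled = max_pool2d(feature_map, size=2)    # shape (2, 3)
flat = pooled.reshape(-1)                   # flattened vector
weights = rng.standard_normal((flat.size, 5 * 7))  # untrained dense layer
prediction = (flat @ weights).reshape(5, 7)        # next 7 days, 5 depths
print(feature_map.shape, pooled.shape, prediction.shape)
```

The shapes show the trade-off discussed above: the 2 × 2 pooling reduces the 5 × 7 feature map to 2 × 3, which cuts the number of dense-layer weights but discards the last row and column.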
The use of too many epochs leads to inefficiency and an overfitted model. In such cases, the CNN model is effective on the training set but might perform poorly on the testing set because of the increased number of epochs. Conversely, a CNN trained for too few epochs cannot capture the features in the training samples. A total of 200 training epochs is chosen for this project. Figure 7 summarizes the time series of water content predicted by the CNN. The results suggest a close fit with relatively low errors. In other words, the CNN is effective in extracting complex nonlinear relationships in water content time series based on spatiotemporal characteristics. Figure 8 summarizes the variation of water content with depth at various times. The results suggest that the CNN predicts water content profiles close to the field measurements. The promising comparisons also suggest that the CNN is capable of making long-term forecasts in the space dimension based on water content profiles, implying competence in capturing spatial features.
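The trade-off between too few and too many epochs can be illustrated by monitoring a validation loss. The sketch below is not the procedure used in this study, which fixes the number of epochs at 200; it shows a common alternative (early stopping with a patience window), and the function name and the loss values are made up for illustration.

```python
def best_epoch(val_losses, patience=10):
    """Return the 1-based epoch with the lowest validation loss, stopping
    the scan once the loss has not improved for `patience` epochs."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_idx, best_loss, waited = 0, float("inf"), 0
    for idx, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = idx, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx + 1

# Made-up validation losses: improvement followed by overfitting
losses = [0.9, 0.5, 0.3, 0.25, 0.27, 0.30, 0.35]
print(best_epoch(losses, patience=2))  # 4
```

A rising validation loss after the minimum is the typical sign that further epochs only fit the training set more closely.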

Comparisons
In this section, the CNN model is compared with three other algorithms, namely MLP, LSTM, and ARIMA, in estimating water content time series. MLPs are used for classification and regression tasks, in which hidden neurons are arranged in layers to extract data features [17]. In many fields of geotechnical engineering, LSTM has been successfully applied to sequential data [18]. ARIMA is another popular forecasting model, based on the integration of moving averages and autoregressions. All three algorithms treat the water content at different depths independently and therefore do not consider spatial relations. The low efficiency of LSTM is likely caused by its architecture. Once the efficiency issue is resolved, it would be worth examining whether LSTM can capture spatiotemporal features.
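The way the comparison algorithms handle each depth as a separate sequence can be illustrated with a simple per-depth forecaster. The sketch below uses a plain autoregressive (AR) model fitted by least squares as a simplified stand-in for ARIMA (no differencing or moving-average terms); the function names, the model order, and the demo data are assumptions, not the configuration used in the study.

```python
import numpy as np

def fit_ar(series, order=2):
    """Least-squares fit of x_t = c + a_1 x_{t-1} + ... + a_p x_{t-p}."""
    series = np.asarray(series, dtype=float)
    rows = [series[t - order:t][::-1] for t in range(order, len(series))]
    design = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coeffs, *_ = np.linalg.lstsq(design, series[order:], rcond=None)
    return coeffs  # [c, a_1, ..., a_p]

def forecast_ar(series, coeffs, steps=7):
    """Iterate the fitted recursion `steps` days beyond the series."""
    order = len(coeffs) - 1
    history = list(np.asarray(series, dtype=float)[-order:])
    out = []
    for _ in range(steps):
        lags = history[::-1][:order]  # most recent value first
        nxt = coeffs[0] + float(np.dot(coeffs[1:], lags))
        out.append(nxt)
        history.append(nxt)
    return np.array(out)

def forecast_per_depth(matrix, order=2, steps=7):
    """Fit and forecast each depth (row) on its own; no information is
    shared between depths, so spatial relations are ignored."""
    return np.vstack(
        [forecast_ar(row, fit_ar(row, order), steps) for row in matrix]
    )

# Hypothetical record: 5 depths, 30 days of synthetic values
rng = np.random.default_rng(1)
demo_matrix = 0.2 + 0.05 * rng.random((5, 30))
print(forecast_per_depth(demo_matrix).shape)  # (5, 7)
```

Because each row is fitted separately, the number of models grows with the number of depths, and none of them can use readings from neighbouring sensors.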

Summary and Conclusions
The prediction of land subsidence and water content profiles is important in many fields, including agriculture, geotechnical engineering, and water resources engineering. For this reason, this paper presents a method based on CNN algorithms that can automatically learn spatiotemporal features for reliable prediction of water content profiles. To establish the feasibility and applicability of the proposed approach, an in-situ project is undertaken. The conclusions derived from the study are summarized below:
-Unlike other networks, the CNN architecture is capable of handling complex spatiotemporal characteristics. Water content profile prediction uses the two dimensions of an image to represent the space and time dimensions of the observed data and then employs the CNN's deep-learning structure to recognize the image.
-In the prediction of water content profiles, CNN performs better than MLP, LSTM, and ARIMA. The other algorithms require more training time and do not achieve the same prediction accuracy as CNN. MLP, LSTM, and ARIMA usually treat the profiles at different depths as separate sequences, which results in inferior performance because it increases the training epochs needed for each sequence and ignores spatial relationships.
-This paper proposes a CNN model that is particularly useful in real-time monitoring and early warning strategies. In other words, the CNN can use the most recent in-situ data to predict the water content profile in a dynamic fashion. Such approaches are promising for engineering and environmental projects, where control strategies can be applied before threshold limits are reached in order to alleviate risks and hazards.