Uncertainty Characterization for Soil Cohesion in a Project Site in Nasiriyah Using Bayesian Methods

High uncertainties arise in the characterization of soil parameters because of the limited data available in geotechnical reports. Reducing these uncertainties may improve the characteristic values of soil parameters. This research aims to probabilistically characterize the soil cohesion parameter at a site in Nasiriyah. The Bayesian approach has been applied to soil data obtained from a project in Nasiriyah. The soil at the site is classified as lean clay, and the soil cohesion has been evaluated using two Bayesian methods: the ordinary normal distribution (OND) method and the Markov chain Monte Carlo (MCMC) method. The prior knowledge used in the Bayesian approach was based on 20 boreholes, and the subjective probability approach was used to construct the prior probability distribution. The OND method gave a posterior mean cohesion of 195.9 kPa and a standard deviation of 14.68 kPa, a coefficient of variation (COV) of 7.49%. It was noted that the likelihood distribution of the measured data has a more significant effect on the posterior distribution than the prior distribution. The MCMC method gave a posterior mean of 167.49 kPa and a standard deviation of 109.8 kPa, a COV of 65.6%. MCMC is considered the most appropriate and common method, especially for high-dimensional problems where the results are not well known, because it can provide a probabilistic estimate for poorly known data.


INTRODUCTION
Geotechnical engineering frequently deals with uncertainty, and engineering design must take it into account [1] (e.g., model uncertainty and inherent variability) [2]. In situ measurements of geotechnical parameters are typically limited and sparse because of terrain limitations and high testing costs [3]. A solution to this lack of region-specific data is urgently needed to create accurate regional models [4]. Results of uncertainty quantification, such as statistics and probability distributions of geotechnical characteristics, rely heavily on the accuracy of the information gathered during the site investigation [5]. The Bayesian approach addresses these problems explicitly: it assumes that geotechnical characteristics are random variables following a specific distribution and estimates their probability density function (PDF) from prior information and field-measured data, followed by probabilistic modeling of the spatial variability and uncertainty of soil properties [6]. The Bayesian probabilistic method helps quantify the uncertainty of input parameters [7]. It is widely used to obtain posterior distributions of geotechnical parameters. Because posterior distributions are difficult to solve analytically, several Bayesian methods have been used to handle this complexity, including the MCMC method and other Bayesian methods that improve the reliability of geotechnical parameters [8].
This study focuses on characterizing the uncertainty of the cohesion of lean clay soils at a project site in Nasiriyah, where cohesion was estimated from the standard penetration test (SPT). Because of the limited data and the resulting uncertainty, two Bayesian methods were used: the MCMC and OND methods. Both methods rely on prior knowledge estimated through the subjective probability method and on measured project data for the statistical analysis and determination of the characteristic value of the undrained cohesion (C_u). Each method derived the posterior distribution in MATLAB code, which determined the posterior µ, σ, and λ.

METHODOLOGY
The research methodology consists of two parts. The first part covers the study area, including the project location, the geology of the site, and the sources of prior knowledge. The second part covers the theoretical basis of Bayesian approaches, the Bayesian methods used in this study, and the theoretical framework of the prior probability distribution.

Study Area

Project Location
The project site is located 2.7 km south of Nasiriyah, the center of Thi-Qar Governorate. Thi-Qar Governorate shares its internal borders with the provinces of Basra, Wassit, Muthanna, Missan, and Qadissiya. Nasiriyah is around 370 km southeast of Baghdad at latitude 31°14' N and longitude 46°19' E. It has a total area of 13.55 km². SPT data were retrieved from the site investigation report for this area. Four boreholes were drilled to a depth of 40 m, and the SPT was conducted during drilling (Figure 1). The large drilling depth was chosen to capture a wide range of soil properties and to support a more reliable design.

Geology of The Site
Nasiriyah is located in southern Iraq, near the Euphrates River, in the Mesopotamian plain. The soil in this region is a floodplain formation consisting of clay, silt, and sand, in which silt forms 60% of the deposits. The silt and sand settle in the swamps, the mud runs below the Shatt al-Arab, and about one million tons of sediment are deposited annually (over about 12,000 years) by the rivers that flow from Turkey, from the northwest to the southeast, through this basin [9][10][11]. The shallow soil in this region belongs to the Holocene epoch.

Previous Knowledge
A considerable amount of research has addressed the characterization of engineering soil parameters based on site investigation reports (e.g., [12][13][14][15][16]). The data for this study, about 20 boreholes, were collected from various projects in the city. The borehole locations are presented in Figure 2. Figure 3A shows the SPT N values of the prior data obtained from the geotechnical reports of seven projects in Nasiriyah, which reach a depth of 25 m. The average SPT N value was calculated over all data at each depth. Figure 3B shows the C_u values estimated from SPT N according to Decourt (1990). The number of cohesion values used varies with depth (for example, at depths of 1 to 20 m and 22 to 25 m). The collected data were presented as histograms to give a more reliable picture and to help reduce the high uncertainty. Figure 4 shows that 95% of the cohesion values in most projects range from 50 to 150 kPa, as estimated from the corrected SPT N values according to Decourt [17]. These data were taken as prior knowledge and expressed as θ_i (µ, σ, λ). This is an essential step in the Bayesian framework for obtaining posterior estimates of soil cohesion in Nasiriyah and for estimating its characteristic value for more reliable geotechnical designs [18]. Several methods are available for assessing prior data. This research used the subjective probability assessment (SPAF) method for soil parameters.
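The conversion from SPT N to undrained cohesion can be illustrated with a short sketch. It is written in Python rather than the MATLAB used in the study, and it assumes a linear correlation of the form C_u = k × N with k = 12.5 kPa per blow, the coefficient commonly quoted for Decourt's correlation. The text does not state the coefficient, so `factor_kpa` is an assumption, and the N values are hypothetical.

```python
def cohesion_from_spt(n_value, factor_kpa=12.5):
    """Estimate undrained cohesion (kPa) from an SPT N value.

    Assumes a linear correlation C_u = factor_kpa * N. The default of
    12.5 kPa per blow is the value commonly quoted for Decourt's
    correlation; it is an assumption, not a figure taken from the paper.
    """
    if n_value < 0:
        raise ValueError("SPT N value must be non-negative")
    return factor_kpa * n_value


# Hypothetical average N values at three depths (not data from the study).
n_values = [4, 8, 12]
cu_values = [cohesion_from_spt(n) for n in n_values]
```

With this assumed coefficient, N values between 4 and 12 map onto the 50 to 150 kPa band reported for most of the prior data.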

Bayesian Approach
Characterizing the ground is necessary for the practice of geotechnical engineering. When combined with other geotechnical information, such as results from field and laboratory tests and field monitoring data, Bayesian techniques offer a powerful paradigm for evaluating site conditions. By combining field and laboratory test data, the Bayesian technique allows the probabilistic estimation of soil properties [19]. Both the data and the statistical parameters are treated as random variables. Random field theory adequately describes the spatial heterogeneity of soils. One of the best-known equations in the Bayesian approach is used to reduce the inherent variance.
In the Bayesian approach, the posterior is obtained from Bayes' theorem:

$$P(\theta \mid X) = \frac{P(X \mid \theta)\,P(\theta)}{P(X)} \qquad (1)$$

where $P(\theta \mid X)$ is the posterior probability, $P(X \mid \theta)$ is the likelihood, $P(X)$ is the probability of producing the observation, and $P(\theta)$ is the prior probability. The vector $\theta$ stands for the set of uncertain variables that will be updated with the observational data $X$, where:

$$P(X) = \int P(X \mid \theta)\,P(\theta)\,d\theta \qquad (2)$$

Because the integral in Eq. 2 is difficult to solve, the posterior can instead be evaluated through the proportionality in Eq. 3 [8]:

$$P(\theta \mid X) \propto P(X \mid \theta)\,P(\theta) \qquad (3)$$

Geotechnical site reports serve as prior information, and both engineering judgment and observational data are expressed as probability density functions (PDFs) that describe the distribution of soil properties by combining the measured data with the prior data. The likelihood function is a crucial part of the Bayesian technique, although it has many restrictions, particularly when the sample is small [20]. It is expressed in terms of the observations $X = \{x_1, x_2, x_3, \ldots, x_n\}$, where the $x_i$ denote the observed values of the soil property. This research applies different Bayesian methods to reach the most accurate posterior estimates, depending on the prior information obtained and the likelihood function. The project data are used to build both the prior and the likelihood of the soil cohesion parameter. These methods are discussed as follows.

Ordinary, Normal Distribution (With Known Variance)
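This method is the most widely used for reliability and data analysis. Still, it is not appropriate for some data modeling because the left tail of the normal distribution extends to negative values, which in most cases are not acceptable. Gelman et al. [21] give the posterior for an ordinary normal distribution (OND) with known variance $\sigma^2$. For a normal prior with mean $\mu_0$ and variance $\tau_0^2$, and $n$ observations with sample mean $\bar{y}$, the posterior distribution of $\theta$ is normal, and its mean $\mu_n$ and variance $\tau_n^2$ are obtained as follows:

$$\mu_n = \frac{\dfrac{\mu_0}{\tau_0^2} + \dfrac{n\,\bar{y}}{\sigma^2}}{\dfrac{1}{\tau_0^2} + \dfrac{n}{\sigma^2}} \qquad (11)$$

$$\frac{1}{\tau_n^2} = \frac{1}{\tau_0^2} + \frac{n}{\sigma^2} \qquad (12)$$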
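The conjugate update for a normal likelihood with known variance can be sketched in a few lines. The example is an illustration in Python, not the MATLAB code used in the study. The function name `normal_posterior` and the prior values are hypothetical. The likelihood mean and standard deviation are the values reported for the measured data, and treating them as a single summary observation (`n=1`) is an assumption made here for illustration.

```python
import math


def normal_posterior(prior_mean, prior_sd, data_mean, data_sd, n=1):
    """Posterior mean and standard deviation for a normal prior combined
    with a normal likelihood of known standard deviation data_sd.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the data mean.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / data_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * data_mean)
    return post_mean, math.sqrt(post_var)


# Likelihood values reported for the measured data: 223.4 kPa and 34.67 kPa.
# The prior mean and standard deviation below are hypothetical placeholders.
post_mean, post_sd = normal_posterior(
    prior_mean=100.0, prior_sd=25.0,
    data_mean=223.4, data_sd=34.67, n=1,
)
```

The posterior mean always falls between the prior mean and the data mean, and the posterior standard deviation is smaller than both inputs, which is why the tighter of the two distributions dominates the result.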

Markov Chain Monte Carlo Technique (MCMC)
MCMC is an algorithm for sampling from probability distributions. Inference in high-dimensional problems is often not tractable analytically, and MCMC addresses this problem. It is one of the most common and important Bayesian statistical methods, especially for inference. It generates thousands of random samples to reach an appropriate estimate of µ and σ from the limited data available. MCMC simulates a sequence of samples of a random variable (e.g., C, SPT N) through a numerical procedure [22]. A density function that deviates from a normal distribution can be sampled using the Metropolis-Hastings (MH) algorithm. The Metropolis algorithm draws samples from an unnormalized density function $q(\theta)$ that does not have a standard distribution. It is written as follows [18]. Starting from $i = 1$, a candidate $\theta^{*}$ is drawn from a symmetric proposal distribution, and the ratio of the densities is determined:

$$r = \frac{q(\theta^{*})}{q(\theta^{(i-1)})} \qquad (13)$$

The candidate $\theta^{*}$ is accepted as $\theta^{(i)}$ with probability $\min(r, 1)$; otherwise, $\theta^{(i-1)}$ is kept as $\theta^{(i)}$. If $i$ equals the required number of samples, the iteration stops; otherwise, $i = i + 1$ and the procedure starts over.

Here $C_u$ refers to the undrained cohesion of the soil at the site. $C_u$ can be modeled as a random variable characterized by µ, σ, and λ, whose correlation structure depends on $d_{ij}$, the distance separating two site points $i$ and $j$, and on λ, the correlation length. $\rho_{ij}$ is the correlation coefficient between the undrained cohesion at points $i$ and $j$, and $C_{ij}$ is the component of the matrix $C$ in the $i$th row and $j$th column, where $C_{ij} = \rho_{ij}\,\sigma^2$. The likelihood function is then built from $C$, where $D$ stands for the observed values of $x$ and $k$ is the number of observations. Presuming that the components are statistically independent lognormal distributions, the prior distribution of $\theta$ can be expressed as follows:

$$f(\theta) = \prod_{i} \frac{1}{\sqrt{2\pi}\,\xi_i\,\theta_i} \exp\left[-\frac{(\ln\theta_i - \eta_i)^2}{2\,\xi_i^2}\right]$$

where $\xi_i$ and $\eta_i$ are the standard deviation and mean of $\ln(\theta_i)$, respectively. They follow from the mean $\mu_{\theta_i}$ and standard deviation $\sigma_{\theta_i}$ of $\theta_i$ as in the following equations:

$$\xi_i = \sqrt{\ln\left(1 + \frac{\sigma_{\theta_i}^2}{\mu_{\theta_i}^2}\right)}$$

$$\eta_i = \ln\mu_{\theta_i} - \frac{\xi_i^2}{2}$$
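The Metropolis iteration described in this section can be sketched as a random-walk sampler. This is a simplified Python illustration, not the MATLAB implementation used in the study. It samples only the mean of the cohesion, assumes independent observations with a known standard deviation in place of the correlated likelihood, and uses a lognormal prior. All numerical values, and the function names `lognormal_params`, `make_log_target`, and `metropolis`, are hypothetical.

```python
import math
import random


def lognormal_params(mean, sd):
    """Return (eta, xi), the mean and standard deviation of ln(theta)."""
    xi = math.sqrt(math.log(1.0 + (sd / mean) ** 2))
    eta = math.log(mean) - 0.5 * xi ** 2
    return eta, xi


def make_log_target(data, sigma, prior_mean, prior_sd):
    """Build the log of the unnormalized posterior density q(theta)."""
    eta, xi = lognormal_params(prior_mean, prior_sd)

    def log_q(theta):
        if theta <= 0.0:
            return -math.inf  # the lognormal prior has no mass at theta <= 0
        log_like = -sum((x - theta) ** 2 for x in data) / (2.0 * sigma ** 2)
        log_prior = (-math.log(theta)
                     - (math.log(theta) - eta) ** 2 / (2.0 * xi ** 2))
        return log_like + log_prior

    return log_q


def metropolis(log_q, theta0, n_samples, step, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta, log_q_theta = theta0, log_q(theta0)
    samples = []
    for _ in range(n_samples):
        candidate = theta + rng.gauss(0.0, step)
        log_q_cand = log_q(candidate)
        # Accept with probability min(r, 1), r = q(candidate) / q(theta).
        accept_prob = math.exp(min(0.0, log_q_cand - log_q_theta))
        if rng.random() < accept_prob:
            theta, log_q_theta = candidate, log_q_cand
        samples.append(theta)
    return samples


# Hypothetical cohesion observations (kPa) and prior settings.
data = [180.0, 210.0, 240.0, 225.0]
log_q = make_log_target(data, sigma=35.0, prior_mean=100.0, prior_sd=30.0)
samples = metropolis(log_q, theta0=100.0, n_samples=20000, step=15.0)
kept = samples[2000:]  # discard the first samples as burn-in
post_mean = sum(kept) / len(kept)
```

Working with log densities avoids numerical underflow when the ratio r is formed from very small density values. A full treatment of µ, σ, and λ would propose all three parameters and evaluate the correlated likelihood at each step.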

Prior Probability Distribution
Defining the prior assumptions before viewing any data is the first step in the Bayesian framework [23]. The Bayesian approach requires a distribution based on prior knowledge. Prior knowledge includes general information about a hypothesis, which may be relevant or vague. It may take the form of any type of distribution that accurately reflects the state of the model or parameter under study, which can be expressed as θ_i (e.g., µ, σ, and λ). These priors are known as conjugate distributions [24]. Prior knowledge can be divided into informative and non-informative prior information, depending on its accuracy and quantity.
Prior knowledge in the form of a uniform distribution can be expressed as follows. Consider a model parameter $\theta_i$ (e.g., µ, σ) with a prior range from $\theta_{i,\min}$ to $\theta_{i,\max}$:

$$P(\theta_i) = \begin{cases} \dfrac{1}{\theta_{i,\max} - \theta_{i,\min}}, & \theta_i \in [\theta_{i,\min},\, \theta_{i,\max}] \\ 0, & \text{otherwise} \end{cases}$$

The subjective probability assessment (SPAF) method was established in response to the difficulties and cognitive biases that arise when prior knowledge is quantified by engineering judgment. Each step taken by the engineer supports the effective use of prior knowledge and reduces unfavorable effects and complications [25]. It follows a cognitive process with several stages (specification of assessment objectives, collection of relevant data, synthesis of the evidence, numerical assignment, and final confirmation), which helped in evaluating the prior µ and σ of the soil property (cohesion).

Prior Distribution
The prior probability distribution of soil cohesion, calculated from SPT, is based on the sites from which the data were collected, a total of 20 boreholes. Table 1 shows the highest and lowest values of these data for C_u and SPT N. These data were used with the subjective probability assessment method to calculate µ, σ, and λ of the soil cohesion, as shown in Table 2. The prior probability distributions of µ, σ, and λ of cohesion were estimated from these data, as shown in Figure 5.

Ordinary, Normal Distribution (With Known Variance)
The data measured at the site are shown in Table 3; four test boreholes make up the drilling program for the project. From these data, the probability distribution of soil cohesion can be estimated. Applying Equation (7) in the MATLAB code gave a likelihood distribution with a mean of 223.4 kPa and σ of 34.67 kPa. The likelihood distribution of C_u is shown in Figure 6. Using the likelihood distribution of the measured data and the prior knowledge, the µ and σ of the posterior probability distribution were obtained from Equations (11) and (12). MATLAB code was developed to plot the likelihood distribution, the prior distribution, and the posterior probability distribution of soil cohesion. Figure 7 shows the posterior mean and standard deviation according to Bayes' theorem, from which it is noted that the probability distribution of prior knowledge has a significant effect on the posterior probability distribution. The estimated C_u is lower than the measured C_u because of the effect of prior knowledge.

Markov Chain Monte Carlo Method (MCMC)
This method was applied to the project C_u data, which were calculated from SPT N using the Decourt [17] and Terzaghi and Peck [26] methods for clay soils. C_u and SPT N were estimated to a depth of 40 m, as shown in Figure 8. The maximum soil cohesion for the project is 526.5 kPa, and the minimum is 43.75 kPa. With this method, 120,000 equivalent samples of C_u, which is treated as a random variable, were generated using the MH method, which yields a large number of samples that are difficult to obtain directly. The sampling relied on the inputs to the MATLAB code, which included the prior data for µ, σ, and λ and the measured data with depth. From these, the posterior mean, σ, and λ were obtained, as shown in Figure 9. The histograms of these quantities are shown in Figure 10.
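The posterior summary statistics reported in this paper can be cross-checked with a short calculation. The sketch is written in Python rather than the MATLAB used in the study, and the helper names `cov_percent` and `summarize` are hypothetical. It computes the coefficient of variation as the standard deviation divided by the mean and applies it to the reported MCMC and OND results.

```python
import statistics


def cov_percent(mean, sd):
    """Coefficient of variation in percent: 100 * sd / mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * sd / mean


def summarize(samples):
    """Return (mean, sample standard deviation, COV %) of a sample list."""
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)
    return mean, sd, cov_percent(mean, sd)


# Posterior values reported in the paper (kPa).
cov_mcmc = cov_percent(167.49, 109.8)  # about 65.6 %
cov_ond = cov_percent(195.9, 14.68)    # about 7.49 %
```

Both results agree with the COV values stated in the paper, which confirms that the reported means and standard deviations are internally consistent.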

CONCLUSIONS
This paper applied two Bayesian methods to describe the uncertainty resulting from the limited geotechnical data measured at a project in Nasiriyah for clay soils. Geotechnical reports for different locations in the city, amounting to approximately 20 boreholes, served as prior information, for which the undrained cohesion was calculated from SPT N according to Terzaghi and Peck [26] for clay soils. Its value ranges from 15 kPa to 187.5 kPa at depths of up to 25 m, given the weakness of the laboratory data observed in the project report.
The OND method gives a posterior probability distribution with µ = 195.9 kPa and σ = 14.68 kPa, depending on the prior distribution and the likelihood distribution of the site-measured data. It is easy to apply in MATLAB code to obtain posterior cohesion estimates. It is concluded that the likelihood distribution of the measured data has a greater influence on the posterior distribution than the prior distribution. The µ of the posterior probability distribution also differs significantly from the prior values, indicating an increase in reliability.
The MCMC method, based on the MH approach, can generate a large number of random samples from distributions that are difficult to sample directly. It addressed the limited C_u data at the site by generating 120,000 samples, from which µ = 167.49 kPa, σ = 109.8 kPa, and λ = 1.95 were obtained. It is observed that the sample distributions are sparse and that the average characteristic reaches a high value and starts from a value of 50. This simulation can address the uncertainty caused by the lack of data and is considered the most suitable and easiest method for handling large, multidimensional problems by applying the code in MATLAB.

Figure 1: The location of the three sites area.

Figure 2: The prior percentile concerning areas of data collection in Nasiriyah.

Figure 4: (a) Probability distribution function (PDF) of mean cohesion and (b) cumulative probability (CDF) of mean cohesion.

Figure 6: The likelihood distribution of the soil cohesion for the measured data.

Figure 7: The prior probability distribution, the likelihood distribution, and the posterior distribution of the cohesion.

Figure 8: Cohesion with depth and standard penetration test SPT N.

Table 1: Prior ranges of µ and σ of soil characteristics of Thi-Qar.

Table 2: Prior percentiles of the mean µ_c, σ_c, and λ_c.

Table 3: Measured data in the project.