Applied Latin Hypercube stochastic method to quantify the uncertainty in groundwater equation model simulations

It is accepted that digital models simplify the physical reality that is the object of the modeling. Hydrodynamic modeling is an approach with high uncertainties in this context. Indeed, the deterministic modeling approach assumes the existence of a functional relationship between the observed variables. The variables are observed by a series of measurements riddled with errors. Because of this, there is always a significant amount of uncertainty associated with a hydrogeological model. This uncertainty can be associated with the conceptual model or with the data and parameters associated with the different components of the model. Some model parameters such as hydraulic conductivity and recharge are particularly susceptible to uncertainty. Stochastic modeling of the hydrodynamics of a groundwater reservoir is an adequate response to allow us to take a step back on the significance of the results. The study is based on the development of a direct problem-solving model which represents the best estimate of the real hydrodynamic system. This model is used to make predictions. With a stochastic approach, a set of models is constructed where each model, as a whole, is considered to be equally likely. Each model is then used to make the prediction or simulate a given scenario. The MODFLOW-STOCHASTIC-GMS code allows us to do randomization simulations (Latin Hypercube method) and with parameter indicators.


Introduction
Latin Hypercube Sampling (LHS) is a frequently used sampling method in the areas of simulation experiment design, computational physics [1,2], computational chemistry [3,4], computational biology [5,6], and other similar disciplines uncertainty analysis, adaptive metamodeling, reliability analysis, and probabilistic load flow analysis [2,3,[7][8][9]. The LHS method is preferred by many researchers when it comes to sensitivity analysis, for example, Helton et all [10] used this method to study the propagation of uncertainty in analyses of complex systems, while Manache et all used the LHS method to sensitivity analysis of a water-quality model [11], This method is still preferred in many new researches in various fields [12][13][14]. LHS has a superior space-filling impact, resilience, and convergence character when compared to other random or stratified sampling methods.
The goal of the LHS extension is to create a larger LHS while keeping the current LHS (or the original LHS). In at least two situations, sampling should be increased, especially for time-consuming simulation systems. Sequential sampling, adaptive metamodeling, and other methods are only a few examples. If the original LHS is subsequently shown to be too small, the second approach is to investigate LHS enlargement, and creating a new LHS with bigger sampling points without the original sample points would be time-intensive. However, due to the LHS structure, expanding the size of an original LHS while retaining its stratification features is challenging.
One example is the integral multiple extension, in which the new LHS is integral times the size of the original sample. For stratified sampling methods, such as LHS, integral-multiple extension algorithms are used, were suggested by Tong [15]. Sallaberry et al. [16] proposed an LHS with correlated variables two-multiple extension method. Later publications titled "nested Latin hypercube design" [17] and "nested orthogonal array-based Latin hypercube design" [18] described two similar methods. A two-layer nested Latin hypercube design is a Latin hypercube design that contains a subset of a smaller Latin hypercube design. The K-extended LHS technique [19] is a particular integral-multiple extension method in which the new LHS includes K smaller LHSs. Vorechovsky [20][21][22] published several similar studies in which The new sample size is much larger than the old one. Using integral multiple extension methods allows you to get a more rigorous LHS while retaining all of your original sample points. The objective of this work is the implementation of the method of a method of stochastic resolution of the diffusivity equation based on LHS, we are interested in this work to simulate random fields of the parameter of the hydraulic conductivity and to solve for each given realization, the flow equation by classical methods of numerical resolutions (finite elements, finite differences, ...).

Stochastic description
Consequently, as in many fields of Physics, we are led to define macroscopic quantities called local quantities in hydrogeology. Two approaches coexist for the definition of the local properties of a porous medium: the deterministic approach based on the theory of R.E.V. "Representative Elementary Volume" and the stochastic approach. Here we describe the first approach. This simply consists of defining a local (macroscopic) quantity using an appropriate spatial mean on a volume element; this local quantity being associated with the centroid of this elementary volume. The geological medium can therefore be seen as a continuous medium. It is implicit in this approach that the characteristic length of this elementary volume ( ) satisfies the inequalities: << << where represent the length scale of the geological domain studied.
To be more precise, let us define ( ) as the volume of a spatial domain included in (the studied geological domain) having for the center, the position vector and ( , ) a tensor quantity, vector or scalar in a porous medium. The spatial mean value of u over at time t is defined by: This usually depends on the size, shape, and orientation of ( ) at time t. So that depends only on and , we have to define a volume 0 between two spheres of volume, This generally depends on the size, shape, and orientation of ( ) at time . So that depends only on and , we must define a volume between two spheres of volume If we can find for 0 common limits and at any point of the domain , then can define a field ( , ) through and treat as a continuous medium for the quantity . The volume is the representative elementary volume (R.E.V.). In the stochastic approach, any property of the geological medium (such as geometry or physical quantities) is treated as a random function of space. The probabilistic tool works as a great simplifier in the sense that the diversity of the geometry of a physical quantity of a porous medium is reduced to the knowledge of average and higher-order moments. In addition, the stochastic approach was able to justify the differences observed (for example) between the dispersion coefficients measured in the laboratory and those measured during experiments. De Marsily [23] uses the simple example of porosity at a midpoint, at the laboratory scale. This variable, noted n, takes the value = 0 when the point is located in a grain and = 1 when it is in a pore. FIG. 1 thus represents an embodiment of a random process (the distribution of the grains) characteristic of the porous medium. We suppose then that we have a number of realizations of this medium. At any point , we have a population of values of the variable porosity for which we can calculate the statistical properties: mathematical expectation, variance, moments of higher order. In the real case, however, we only have a porous medium to study, that is to say, that a single realization of the random processes that characterize it. For example, we obtain the expectation of the porosity at a point P defined by: Example of an implementation of the "grain distribution" variable (high porosity with large spaces and small spaces).

Groundwater flow equation, stochastic approach
By considering the properties of the natural formation studied as random spatial functions associated with a statistical structure characterized by its moments (mathematical expectation, variance, and spatial covariance law), the stochastic approach makes it possible to integrate this heterogeneity into the very formalism of considered equations. The resolution of the equations thus obtained, however, requires the use of approximations and simplifying assumptions leading to analytical solutions which may differ from one author to another (see for example [24][25][26]. Therefore, these analytical solutions apply in principle only to specific cases (uniform flow, infinite or semi-infinite medium, low heterogeneity in particular). Their validation, therefore, requires recourse to in situ experiments (for example [27]) or to numerical simulations [6,[28][29][30][31][32][33]. The results of these different works show that these solutions can be applied to cases where the spatial variability of ℎ is low. Pursuing the goal of extending the potential field of application of the results from stochastic theory, some authors have focused on solving the flow and transport equations in non-uniform flow cases [32,34].
We take the scale of the natural formation, which in most cases consists of different horizontal strata of depth varying from a few meters to several hundred meters. On this scale, the properties of the medium, as well as the flow variables, are magnitudes calculated or measured on volumes large enough to ensure that they are macroscopic concerning the scale of the pores. These properties can be considered deterministic and attached to each point in space. The medium and the fluid are assimilated to a continuous medium where one will be able to represent the properties and the macroscopic variables like vectors or continuous spatial tensors. = is the kinematic porosity. We consider that the flow problem is of direct and unconditional type, that is to say, that we propose to try to identify and (through their statistical properties). The stochastic methods of solving this problem are based in particular on the properties of the spatial distribution of hydraulic conductivity. The experimental data compiled in [35] allowed us to deduce that the hydraulic conductivity ( ) is highly variable and should be characterized by a logarithmic distribution density [36]. Freeze's analysis concludes that the distribution that best fitted the data was a lognormal distribution for K; which means. Y = ln(K) The flow problem was stated as follows by Dagan [37]: "given the domain W and the properties of the medium in the form of random functions (eg: ), it is necessary to determine the random spatial function which satisfies the flow equation and the boundary conditions ".

Latin Hypercube simulation
A stratified Monte Carlo sampling technique is known as Latin Hypercube Sampling (LHS) (MC). The sample region is partitioned in a certain manner by dividing the range of each component of . Only the case where the components of are independent or can be transformed to an independent basis will be examined. Sample generation for linked components with a Gaussian distribution is also straightforward. To generate a sample size from the variables 1 , 2 , 3 , . . . . . . 4 , LHS operates in the following manner. The range of each variable is partitioned into non-overlapping intervals based on equal probability size 1/ . In terms of the probability density in the interval, one value is selected at random from each interval. The values for Once the segments are defined, each parameter is then randomly sampled until the true value is found within each probability segment. The random numbers for each parameter are combined with the random numbers for the other parameters, such that all possible combinations of the segments are sampled. The total number of runs of the model is the product of the number of segments for each parameter. We consider a heterogeneous aquifer characterized by two zones of distinct hydraulic conductivity , If we decompose each zone into three segments. The total number of combinations would be 2 3 equals 8 possible solutions. From which we have framed the total area in the probability curve taking into account the specificity of the maximum and the minimum to have the maximum range of parameters while maintaining the best chance for the stability of the model.  (Figure a1 and Figure a2): respectively represent the histogram of the probability density function (PDF) and the cumulative distribution function (CDF). (Figure b1 and Figure b2): represent respectively PDF and CDF of ( ). Figure 2a and figure 2b show an example of a PDF with parameter ℎ according to the Log-normal law as well as the corresponding distribution ( ℎ). We will also define in the parameterization of LHS a variation interval for ℎ to keep the values in an acceptable range. The sampling of values of the parameter ℎ is based on the results of Table 1, specifying the mean, the standard deviation, and the coefficient of variation. We have chosen to subdivide each distinct area of hydraulic conductivity by three segments as mentioned in figure 2. We then have 8 possible realizations and consequently 8 possible stochastic solutions for the problem some of which have shown a great agreement. at the reference hydraulic heads (Figure 3).

Conclusion
This work implements the stochastic model simulating the hydrodynamic behavior of the ground flow equation based on the probabilistic distributions of the permeability parameter. This is made possible through the use of the LHS method and criteria on the choice of the type parameter in the first place (hydraulic conductivity). The use of the LHS distribution model allowed, in the beginning, to focus on the type of model likely to better fit the hydrodynamic parameters calibrated during the direct modeling phase and most vulnerable to estimation uncertainties, and for guided towards the choice of the stochastic solution.