Tracking the critical offshore conditions leading to inundation via active learning of full-process based models

High-fidelity, full-process based numerical models allow accurate simulation of the coastal response to storms. For probabilistic risk assessment, a large number (>10,000s) of combinations of offshore hydrodynamic conditions (e.g. wave characteristics, offshore water level, etc.) often has to be simulated. To optimise this procedure, it can be of interest to concentrate the computational effort on identifying only the critical set of offshore conditions that lead to inundation of key assets for the studied territory (e.g., evacuation routes, hospitals, etc.). However, two limitations exist: 1. full-process based models have a large computation time cost, typically several hours per run, which often prevents running many simulation scenarios; 2. full-process based models are expected to present non-linearities (non-regularities) or shocks (discontinuities). In this study, we propose a strategy combining meta-modelling (of Support Vector Machine type) and active learning techniques to track, with a limited number of long-running simulations, the critical set's boundary. The developments are done on a cross-shore case, using the process-based SWASH model (computational time of 10 hours for one run). The dynamic forcing conditions are parametrized by the storm surge S and the significant wave height H_s.
We validated the approach against a reference set of 400 long-running simulations in the (S; H_s) domain. Our tests showed that the critical contour can be tracked with a reasonable number of long-running simulations, of a few tens.


Introduction
Recent storm events such as Katrina in 2005 or Xynthia in 2010 (see e.g. [1]) illustrate the present-day damages and injuries that can affect coastal areas, in both cyclonic and non-cyclonic environments. Katrina was one of the six most powerful hurricanes ever registered in the Atlantic, leading to 1836 deaths and damages of about 80 billion USD, whereas Xynthia was a mid-latitude storm that severely hit low-lying coasts located in the central part of the Bay of Biscay on 27-28 February 2010, inducing 53 deaths and material damages estimated at more than one billion euros. From a statistical point of view, the wave heights generated during the Xynthia event could not be considered as extremes, but what makes this event "rare" is the combination of a high spring tide with a large storm surge (enhanced by young wind waves) reaching its maximum around the tide peak.
[Figure 1 caption, partially recovered: This frontier represents the boundary between the "safe" and "unsafe" regions (grey-coloured area), i.e. the boundary of the set of offshore forcing conditions which lead to an inundation at the coast. Adapted from [3].]
As discussed by Idier et al. [2], it can be of high interest to identify all the critical combinations of offshore conditions that lead to inundation of key assets for the studied territory (e.g., assembly points, evacuation routes, hospitals, etc.), i.e. to track the critical frontier Γ_C, which separates the "safe" region from the "unsafe" one in the offshore conditions domain (e.g. combination of wave, tide and surge conditions): this is schematically depicted in Fig. 1 in 2D. It should be underlined that the number of offshore conditions could be larger (>10), depending on the physical parameters considered (tide, atmospheric storm surge, wave height, wave direction, wave period) and on the way the temporal and/or spatial offshore conditions are described. Tracking such a critical frontier is the key element of an inverse methodology of coastal risk assessment.
The main idea of such an inverse risk method is to invert the usual risk assessment steps [2]: one starts from the maximum acceptable hazard level (defined by stakeholders as the one leading to the maximum tolerable consequences) and finally obtains the return period of this threshold. Such an "inverse" approach would allow the identification of all the offshore forcing conditions (and their occurrence probability) that threaten critical assets of the territory. The benefits are multiple: (1) estimating the probability of exceeding the maximum tolerable inundation height for identified critical assets; (2) providing critical offshore conditions for flooding in early warning systems; and (3) raising awareness of stakeholders and eventually enhancing preparedness for future flooding events by allowing them to assess the risk to their territory.
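As a rough illustration of the last step of the inverse method, the sketch below estimates the occurrence probability of the unsafe set by Monte Carlo sampling and converts it into a return period. Everything numerical here is an assumption made for the example and is not taken from the study: the linear frontier in `is_unsafe`, the independent exponential and Weibull distributions for S and H_s, and the rate of five storm events per year.

```python
import random

def is_unsafe(surge, hs):
    """Hypothetical critical frontier: True if (S, H_s) leads to inundation."""
    return surge + 0.1 * hs > 1.5

def exceedance_probability(n_samples=100_000, seed=0):
    """Monte Carlo estimate of the per-event probability of the unsafe set."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        surge = rng.expovariate(1.0 / 0.4)   # assumed: exponential surge, mean 0.4 m
        hs = rng.weibullvariate(3.0, 2.0)    # assumed: Weibull H_s, scale 3 m, shape 2
        hits += is_unsafe(surge, hs)         # S and H_s sampled independently (assumed)
    return hits / n_samples

p_event = exceedance_probability()
events_per_year = 5.0                        # assumed storm-event rate
return_period_years = 1.0 / (events_per_year * p_event)
print(f"P(unsafe | event) = {p_event:.4f}, "
      f"return period = {return_period_years:.1f} years")
```

In practice the frontier would come from the tracked critical set and the joint distribution of the offshore conditions from a statistical analysis of observations; the point of the sketch is only that, once the frontier is known, the probability estimate no longer requires any long-running simulation.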
However, the practicability of such an inverse risk method can be hindered because tracking the boundary of the set of critical conditions (the critical frontier) has to rely on an accurate description and modelling of coastal processes: this requires full-process based models for coastal flooding simulations, which may have a very large computational cost (typically several hours per simulation). Such a computational burden often limits the analysis to a few scenarios and hence might prevent the estimation of the critical frontier. Recently, it has been shown that meta-modelling approaches can efficiently handle this difficulty (e.g., [3]). The basic idea is to replace the long-running code by an approximation constructed from only a limited number of simulation scenarios. Rohmer & Idier [3] further extended this approach by combining the meta-model with an adaptive sampling procedure that improves the local accuracy in the regions of the offshore conditions domain that contribute the most to the estimate of the targeted frontier.
Though the afore-described strategy proved to be very efficient (reducing the total number of necessary long-running simulations by a factor of 20-40 in the application case of [3]), it still faces a strong limitation related to the nature of full-process based models: they are expected to present strong non-linearities (non-regularities) or shocks (discontinuities), i.e. dynamics controlled by thresholds. For instance, in the presence of a coastal defence, the waterline position first behaves linearly (it increases with increasing offshore conditions) as long as there is no overtopping, and then increases very strongly as soon as the offshore conditions are energetic enough to cause wave overtopping, and then overflow. Such behaviour might make the training phase of the meta-model very tedious (see e.g., [4]).
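The difficulty can be made concrete with a toy response. In the sketch below, `waterline` is a hypothetical one-dimensional stand-in for the simulator (mild linear regime below an assumed crest level, steep regime above it), and `linear_surrogate` is a deliberately naive meta-model built from two runs in the no-overtopping regime. Neither is taken from the study; the slopes and the crest level are arbitrary.

```python
def waterline(forcing, crest=1.0):
    """Hypothetical waterline position as a function of a scalar offshore forcing."""
    if forcing <= crest:
        return 0.1 * forcing                       # no overtopping: mild linear regime
    return 0.1 * crest + 20.0 * (forcing - crest)  # overtopping: steep increase

def linear_surrogate(x0, x1):
    """Naive meta-model: straight line through two training runs."""
    y0, y1 = waterline(x0), waterline(x1)
    slope = (y1 - y0) / (x1 - x0)
    return lambda x: y0 + slope * (x - x0)

surrogate = linear_surrogate(0.2, 0.8)   # both training runs lie below the crest
predicted = surrogate(1.2)               # extrapolation across the threshold
actual = waterline(1.2)
print(f"surrogate: {predicted:.2f}, simulator: {actual:.2f}")
```

The surrogate reproduces the response well between its training points but misses the regime change entirely, which is why a smooth regression-type meta-model needs many runs close to the threshold before it becomes reliable there.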
In the present study, we propose to rely on advanced machine learning techniques to overcome the afore-described challenge related to non-regularity. We focus on Support Vector Machines (SVM; e.g., [5]). A key aspect is to keep the number of simulations required for estimating the critical frontier as low as possible: this can be achieved with active learning techniques, i.e. statistical techniques that guide the selection of the simulations to be run so that the boundary of the set of critical conditions is predicted with high accuracy. The objective of this communication is to show the feasibility of such an approach.
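The general idea of combining an SVM classifier with active learning can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: `simulator` is a cheap hypothetical replacement for a long-running run on a normalised (S, H_s) domain, the SVM is a bias-free RBF-kernel variant fitted by dual coordinate ascent so that the example needs no external library, and the selection rule (query the unlabelled candidate with the smallest absolute decision value) is one common margin-based criterion chosen here for simplicity. The values of `gamma`, `C`, the grid size and the budget are arbitrary.

```python
import math

def simulator(s, h):
    """Hypothetical stand-in for one long-running run: +1 = inundation, -1 = safe."""
    return 1 if s + 0.5 * h * h > 0.613 else -1

def rbf(a, b, gamma=10.0):
    """Gaussian (RBF) kernel between two points of the normalised domain."""
    return math.exp(-gamma * ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2))

def fit_svm(X, y, C=100.0, sweeps=200):
    """Bias-free kernel SVM: maximise the dual by cyclic coordinate ascent."""
    n = len(X)
    K = [[rbf(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Exact maximiser of the dual along coordinate i, clipped to [0, C]
            alpha[i] = min(C, max(0.0, alpha[i] + (1.0 - y[i] * f_i) / K[i][i]))
    return alpha

def decision(x, X, y, alpha):
    """Signed SVM score; its zero level set is the estimated critical frontier."""
    return sum(a * yi * rbf(xi, x) for a, yi, xi in zip(alpha, y, X))

def active_learning(budget=25, n_grid=21):
    """Start from the four corners, then add one simulator run per iteration."""
    grid = [(i / (n_grid - 1), j / (n_grid - 1))
            for i in range(n_grid) for j in range(n_grid)]
    X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    y = [simulator(*x) for x in X]
    for _ in range(budget):
        alpha = fit_svm(X, y)
        labelled = set(X)
        candidates = [x for x in grid if x not in labelled]
        # Query the candidate the current classifier is least certain about
        x_new = min(candidates, key=lambda x: abs(decision(x, X, y, alpha)))
        X.append(x_new)
        y.append(simulator(*x_new))
    return X, y, fit_svm(X, y), grid

X, y, alpha, grid = active_learning()
correct = sum((decision(x, X, y, alpha) > 0) == (simulator(*x) > 0) for x in grid)
print(len(X), "simulator runs; agreement on the grid:", correct / len(grid))
```

Only `simulator` would be expensive in a real application; refitting the classifier and scanning the candidate grid cost a negligible amount of time compared with a single multi-hour run, which is what makes spending effort on the choice of the next run worthwhile.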
In Section 2, we describe the case study which motivated the present work. In Section 3, we describe the statistical methods used to track the critical set of offshore conditions. In Section 4, we apply them and discuss the results.

Case study
2.1 Description
The application case relies on a cross-shore configuration, as depicted in Fig. 2, and uses the process-based SWASH model [6]. As shown in Figure 3, the relationship between the run-up R and both offshore conditions is highly nonlinear: a region of the S × H_s domain does not lead to flooding, i.e. R stays close to its minimum value (around 0), as highlighted by the blue-coloured region in the bottom left-hand corner of Fig. 3. When both offshore conditions exceed given thresholds (which are unknown before actually running the code), overtopping occurs and the basin rapidly fills with sea water, so that R steeply increases from ~0 to ~90% over a narrow region of (S; H_s) and then gently increases from ~90% to 100% over a large region of the S × H_s domain (red-coloured region in the right-hand corner of Fig. 3). The objective is to estimate the black contour depicted in Fig. 3B.