Exploratory Research of New Curve Number System

In the past, the curve number (CN) was determined from the SCS handbook. Selecting the CN is the key step in predicting runoff with the SCS-CN model, yet the conventional CN methodology often produces inconsistent runoff estimates when an inappropriate CN is selected. A new technique was therefore developed that derives the curve number directly from rainfall-runoff datasets, using supervised numerical optimization guided by inferential statistics, to improve the accuracy of surface runoff prediction. Furthermore, a two decimal point CN system is proposed in this study. The optimum CN of the Melana site is 90.45 at alpha 0.01, with a BCa 99 % confidence interval ranging from 90.45 to 95.12. The regional specific calibrated SCS-CN model with the two decimal point CN derivation technique outperformed the runoff predictions of the conventional SCS-CN model and the asymptotic curve number fitting method.


Introduction
In recent years, rapid urbanization has made flooding more frequent in the main cities of Malaysia, so rainfall-runoff models play an important role in planning and managing water resource systems and flood control. Various types of rainfall-runoff models are available. Since 1954, the Soil Conservation Service Curve Number (SCS-CN) model has been one of the most popular rainfall-runoff models for predicting direct runoff from agricultural sites, and because of its simplicity it was later extended to urban watersheds. Nowadays, the SCS-CN model is incorporated into software and techniques such as ArcGIS [15,54,55], remote sensing [15,29,35] and SWAT [24,41]. The SCS-CN method has also been adopted by Malaysian government agencies and is taught in engineering hydrology textbooks.
Unfortunately, the accuracy and consistency of surface runoff prediction with the curve number method derived by SCS have been questioned by hydrologists from various countries [1, 8-9, 13, 16-18, 25, 27, 30, 33, 37-39, 50, 53, 56]. One study showed that tabulated CN values tend to over-predict the runoff amount [37]. Furthermore, a study in South Korea concluded that tabulated CN values could not produce satisfactory runoff estimates [24]. The tabulated CN lumps the effects of land use and land cover condition, hydrologic soil group (HSG), and antecedent runoff condition (ARC), all based on watershed characteristics, into a single coefficient [3,42-45,51]. As a result, CN is the most important parameter in the SCS-CN model, and a wrong selection of CN leads to inappropriate runoff prediction [17]. In the original 1954 procedure, a suitable CN value to represent the watershed was looked up in the CN handbook. The CN value was then substituted into the CN formula shown in Eq. 1 to find the maximum potential water retention of the watershed (S). With λ fixed at 0.2, the S value and the event rainfall (P) were substituted into the base SCS-CN model shown in Eq. 2 to determine the surface runoff (Q).
$$S = \frac{25400}{CN} - 254 \qquad (1)$$
$$Q = \frac{(P - I_a)^2}{P - I_a + S}, \quad I_a = \lambda S, \quad \text{for } P > I_a \qquad (2)$$
where CN is the curve number, Q is the runoff depth (mm), P is the rainfall depth (mm), Ia is the initial abstraction (mm), λ is the initial abstraction ratio, and S is the maximum potential water retention of the watershed (mm).
The CN is a dimensionless watershed index that ranges from 0 to 100, representing a watershed from high infiltration (fully permeable) to fully impervious respectively. All the tabulated CN values and CN charts available in the Soil Conservation Service (SCS) National Engineering Handbook Section 4: Hydrology (NEH-4) [42-45] for agricultural areas, as well as in Technical Release 55 (TR-55) [51] and as applied in the Technical Release 20 (TR-20) software for small and big urban catchments respectively, were derived from approximately 199 experimental watersheds in the United States of America (USA). These watersheds ranged in size from 0.0971 ha to 18,600 ha and were located at 23 locations nationwide. The derivation used measurements of annual maximum rainfall and runoff collected between 1928 and 1954 and thousands of infiltrometer tests [49]. However, the information about the initial development of the curve number method has not been preserved [19,50,51].
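As a minimal illustration of Eq. 1 and Eq. 2, the following Python sketch computes S from a CN and then the runoff depth for one rainfall event. The function names `retention_from_cn` and `scs_runoff` are hypothetical and are not taken from the SCS documentation; λ defaults to the conventional value of 0.2, and runoff is set to zero when P does not exceed Ia.

```python
def retention_from_cn(cn: float) -> float:
    """Eq. 1: maximum potential retention S (mm) from a curve number."""
    if not 0 < cn <= 100:
        raise ValueError("CN must lie in the interval (0, 100]")
    return 25400.0 / cn - 254.0


def scs_runoff(p_mm: float, cn: float, lam: float = 0.2) -> float:
    """Eq. 2: runoff depth Q (mm) for an event rainfall depth P (mm)."""
    s = retention_from_cn(cn)
    ia = lam * s  # initial abstraction Ia = lambda * S
    if p_mm <= ia:
        return 0.0  # rainfall does not exceed Ia, so no runoff is generated
    return (p_mm - ia) ** 2 / (p_mm - ia + s)
```

For example, `scs_runoff(61.6, 90.45)` evaluates the conventional model (λ = 0.2) at a rainfall depth of 61.6 mm, and passing a different `lam` shows how sensitive the result is to the initial abstraction ratio.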
The NRCS handbook [42-45] was created from the hydrological conditions and soil group types of the United States of America (USA), so the soil and hydrological conditions in Malaysia might not be the same. An unsuitable soil group type may therefore be assigned to a particular watershed in Malaysia, producing an unrealistic runoff volume. As a result, SCS-CN practitioners should not blindly adopt the CN values in the NRCS handbook [42-45], TR-55 [51] and the TR-20 software. Although the SCS-CN model has been integrated with remote sensing (RS) and geographic information systems (GIS) to exploit high resolution remote sensing imagery, the number of land cover types described in NEH-4 [42-45], TR-55 [51] and applied in the TR-20 software is still so large that it is hard to classify land cover into the appropriate CN categories accurately [6]. Hawkins (1998) and Canters et al. (2006) [4,20] therefore suggested that CN tables should only be used as a guideline, and that the actual CN should be determined from local and regional data. Many researchers have developed CN calculation methods that combine land cover information with the original CN in TR-55 [14], but the resulting CN is still unable to represent a regional specific watershed. To improve the surface runoff prediction, most SCS practitioners commonly tweak the CN by "trial and error" against observed data without any statistical justification. The "trial and error" method cannot produce a consistent CN to represent a particular watershed. In recent decades, most research studies have used recorded daily rainfall-runoff (P-Q) data pairs from the local or a nearby watershed to derive the local CN value [17,21,22,46,47]. A few approaches for determining CN from observed P-Q data have been reported, such as the least-squares method (LSM) [50] and the asymptotic fitting method (AFM) [17,19].
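One common way to obtain a CN from a single observed P-Q pair, sketched here under the assumption that λ is fixed at 0.2, is to solve the base SCS-CN equation for S and then convert S to CN with CN = 25,400/(S + 254). With λ = 0.2 the equation is a quadratic in S whose physically meaningful root is S = 5[P + 2Q − √(4Q² + 5PQ)]. The function name `event_cn` is hypothetical, and this is an illustration of event-wise CN back-calculation, not the derivation technique proposed in this study.

```python
import math


def event_cn(p_mm: float, q_mm: float) -> float:
    """Back-calculate the CN of one event from rainfall P and runoff Q (mm).

    Assumes the conventional initial abstraction ratio lambda = 0.2.
    """
    if p_mm <= 0 or q_mm < 0 or q_mm > p_mm:
        raise ValueError("require P > 0 and 0 <= Q <= P")
    # Root of Q = (P - 0.2 S)^2 / (P + 0.8 S) that keeps Ia = 0.2 S <= P
    s = 5.0 * (p_mm + 2.0 * q_mm - math.sqrt(4.0 * q_mm ** 2 + 5.0 * p_mm * q_mm))
    return 25400.0 / (s + 254.0)


def event_cns(pairs):
    """Apply event_cn to an iterable of (P, Q) pairs given in mm."""
    return [event_cn(p, q) for p, q in pairs]
```

Applying `event_cns` to a set of observed pairs gives one CN per event; the spread of those values is what methods such as LSM and AFM then summarise into a single watershed CN.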
The asymptotic fitting method created by Professor Hawkins is based on the frequency matching concept, in which the rainfall and runoff data are sorted separately in descending order. The method classifies watershed response into three behaviors: standard, complacent and violent. For complacent behavior, no constant CN can be obtained. The asymptotic fitting method has been applied in several earlier research studies [2,8,34,40,49]. A study in Sicily by D'Asaro et al. [8] found that the CN values obtained by the asymptotic fitting and least-squares fitting methods were all lower than the tabulated CN values in the handbook table.
In summary, two issues are addressed in this article:
Issue 1: The limitation of the NRCS handbook in choosing an exact curve number.
Issue 2: Whether the regional specific curve number should be rounded.
This study therefore presents a new approach for deriving a regional specific CN directly from P-Q data sets under the guidance of non-parametric inferential statistics, and proposes a new CN system with at least two decimal points to address both issues.

Data and methodology

Study site
In this study, Melana watershed is chosen to demonstrate the regional specific urban SCS model and the new CN system. Melana watershed is located in Johor, Malaysia, between 1°30'N to 1°35'N and 103°35'E to 103°39'E, as shown in Fig. 3 [5]. The total area of the whole Melana watershed is 21.12 km². A total of twenty-seven rainfall-runoff events recorded between July 2004 and October 2004 were adopted from Chan (2005) [5], as shown in Table 1 [5]. Melana watershed is urbanizing rapidly: only 20 % of the watershed was urbanized, and seven years later more than 60 % of the area was expected to be residential [32]. Therefore, to prevent the rainfall-runoff data sets from being affected by land cover and land use change caused by this rapid development, only a short period of rainfall-runoff data pairs was used in this study.

Methodology
In this study, the non-parametric inferential bootstrapping technique with the bias corrected and accelerated (BCa) procedure and 2,000 resamples [11,12] was applied, with the help of a supervised numerical optimisation technique, to calibrate the base SCS-CN model and to derive the regional specific CN for Melana watershed directly from the P-Q datasets. Bootstrap BCa statistics were chosen for their robustness and for the inferential ability provided by the confidence interval [7,10,52]. For the same study site, null hypotheses were set up and all were rejected in previous studies by Ling in 2015 and 2017 [26-28], which also concluded that the lambda value should not equal 0.2. The 99 % confidence interval (CI) ranges for λ and S were also reported in those publications. The present results lead to the same conclusion as the previous studies [26-28]: fixing the lambda value at 0.2 is rejected because 0.2 does not lie within the λ confidence interval of (0.0004, 0.0005). The regional specific equivalent CN0.2 value for the Melana site can be derived with the CN formula (CN = 25,400/(S + 254)) proposed by SCS for CN comparison [19,23]. The calibrated Sλ must first be correlated using the general Sλ formula derived by Ling (2017) [27], shown in Eq. 3, before it is substituted back into the CN formula. Throughout this study, the derived CN value and the new CN system are established for Melana watershed.
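The BCa interval can be sketched with the Python standard library alone. The example below is an illustrative assumption, not the authors' implementation: the function name `bca_interval`, the fixed seed, the statistic passed in (here the sample mean) and the sample values are all hypothetical. It resamples the data 2,000 times, estimates the bias correction from the bootstrap distribution and the acceleration from jackknife estimates, and returns the adjusted percentile limits.

```python
import random
from statistics import NormalDist, mean


def bca_interval(data, stat, n_boot=2000, alpha=0.01, seed=0):
    """Bias corrected and accelerated (BCa) bootstrap confidence interval."""
    data = list(data)
    n = len(data)
    rng = random.Random(seed)
    nd = NormalDist()
    theta_hat = stat(data)

    # Bootstrap distribution of the statistic (resampling with replacement)
    boots = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))

    # Bias correction z0: share of bootstrap estimates below the point estimate,
    # clamped away from 0 and 1 so the inverse normal CDF is defined
    below = sum(b < theta_hat for b in boots) / n_boot
    below = min(max(below, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = nd.inv_cdf(below)

    # Acceleration a from leave-one-out (jackknife) estimates
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    j_mean = sum(jack) / n
    num = sum((j_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((j_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted_index(q):
        # Map the nominal quantile q to the BCa-adjusted position in boots
        z = nd.inv_cdf(q)
        adj = nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))
        return min(n_boot - 1, max(0, int(adj * n_boot)))

    return boots[adjusted_index(alpha / 2)], boots[adjusted_index(1 - alpha / 2)]


# Hypothetical event-wise CN values, for illustration only (not the Melana data)
sample = [88.2, 91.5, 90.1, 93.4, 89.7, 92.8, 90.9, 94.6, 91.2, 89.3, 92.1, 90.6]
low, high = bca_interval(sample, mean, n_boot=2000, alpha=0.01)
print(low, high)
```

A hypothesised parameter value, such as λ = 0.2, is rejected at the chosen alpha level when it falls outside the interval returned for that parameter.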

Runoff Model Assessment
Three measures are used to assess the runoff models: the Nash-Sutcliffe efficiency, the residual sum of squares (RSS) and the BIAS, as reported in Table 2.
Two issues were raised in the introduction. First, a CN chosen from the tabulated handbook is not sensitive enough because of the large gaps between the tabulated CN values, so the exact CN cannot be picked correctly from the NEH-4 handbook [42-45]. However, an accurate regional specific CN for Melana watershed can be calculated directly from the P-Q data sets: the optimum CN is 90.45 at alpha 0.01, with a BCa 99 % confidence interval of (90.45, 95.12). According to the NEH-4 handbook [42-45], the CN of an urban district with hydrological soil group A is 89, which lies outside the confidence interval obtained for Melana watershed. Therefore, only soil groups B, C and D are statistically significant candidates for Melana watershed. Climate change and rapid urbanization make it challenging to identify a suitable CN for the Melana watershed from the tabulated handbook. Selecting the CN from the NRCS handbook [42-45] therefore carries a risk of committing a Type II error.
Fig. 4 clearly shows that the uncertainty in runoff volume changes with precipitation. When the precipitation is less than 12 mm, the CN selected from the handbook for an urban district at the Melana site tends to over-predict the runoff amount by up to 35 thousand m³ and to under-predict it by up to 15 thousand m³. When the rainfall amount is more than 12 mm, the runoff volumes for CN = 84, 92, 94 and 95 are all over-predicted, by up to 500 thousand m³. Moreover, Fig. 4 clearly shows that when the CN increases from 85 to 95, the risk of runoff over-prediction is significant and is further magnified toward higher rainfall depths. As a result, using the NRCS handbook [42-45] gives inconsistent runoff predictions.
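The volumetric comparisons above follow from multiplying a runoff depth by the watershed area (1 mm over 1 km² equals 1,000 m³). The sketch below shows that conversion for the 21.12 km² Melana watershed and the difference in volume between two CN values. It is a simplified illustration with hypothetical function names, and it assumes λ = 0.2 for both CN values, so it is not expected to reproduce the values in Fig. 4, which depend on the calibrated model.

```python
AREA_KM2 = 21.12  # total area of Melana watershed stated in the study site section


def runoff_depth_mm(p_mm: float, cn: float, lam: float = 0.2) -> float:
    """Runoff depth (mm) from the base SCS-CN model."""
    s = 25400.0 / cn - 254.0  # maximum potential retention (mm)
    ia = lam * s              # initial abstraction (mm)
    if p_mm <= ia:
        return 0.0
    return (p_mm - ia) ** 2 / (p_mm - ia + s)


def runoff_volume_m3(p_mm: float, cn: float, area_km2: float = AREA_KM2) -> float:
    """Runoff volume (m^3): 1 mm of depth over 1 km^2 is 1,000 m^3."""
    return runoff_depth_mm(p_mm, cn) * area_km2 * 1000.0


def volume_difference_m3(p_mm: float, cn_test: float, cn_ref: float) -> float:
    """Volume predicted with cn_test minus volume predicted with cn_ref."""
    return runoff_volume_m3(p_mm, cn_test) - runoff_volume_m3(p_mm, cn_ref)
```

For instance, `volume_difference_m3(61.6, 92, 90.45)` compares the handbook CN for soil group B with the derived CN at a rainfall depth of 61.6 mm; a positive result means the handbook value predicts more runoff.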
The NEH-4 handbook [42-45] has some limitations. The tabulated CN handbook often shows sudden jumps in CN choice across the four hydrologic soil groups, as shown in Fig. 1. Moreover, the handbook chart shown in Fig. 2 selects the CN in inches instead of SI units, under the condition that the λ value in the SCS-CN model is fixed at 0.2. CN was proposed as a whole number and is still used that way today. However, rounding the CN causes a large difference in runoff volume even when the CN differs by only one class. This indicates that CN variation affects the direct runoff estimate more than rainfall variability does. Fig. 5 shows the runoff volumetric difference when the calculated CN = 90.45 is rounded up to 91 and rounded down to 90. It clearly shows that the CN should not be rounded up or down to a whole number. When CN = 91, the runoff volume tends to be over-predicted by up to 300 thousand m³. When CN = 90, the maximum over-predicted runoff volume is 255 thousand m³. Rounding the CN therefore introduces a large uncertainty in the runoff volume of Melana watershed.
In this study, a new CN system with at least two decimal points is proposed instead of one decimal point. This is because an over-prediction of nearly 273 thousand m³ remains at a rainfall depth of 61.6 mm under CN = 90.40, as shown in Fig. 7, whereas the volumetric difference for the two decimal point CN = 90.44 is only 270 thousand m³, as shown in Fig. 8. As the rainfall amount increases, the runoff volume difference also increases. The volumetric difference for a two decimal point CN is smaller than for a one decimal point CN. Thus, the CN should be expressed with at least two decimal points. Nowadays, most research studies use the asymptotic CN fitting method (AFM) to find CN∞ in order to avoid the inconsistent runoff prediction that results from the tabulated CN handbook.
The CN∞ of the Melana site was derived as 81.72 in a previous research study in 2017 [26]. The accuracy of the asymptotic CN runoff model is benchmarked against the conventional SCS-CN model and the calibrated SCS-CN model using several model assessment measures, as tabulated in Table 2. Table 2 shows that the calibrated model has a higher Nash-Sutcliffe value, a lower RSS and a BIAS close to zero. Furthermore, the CN∞ value determined by the AFM model has no proof of statistical significance at alpha 0.01.
Table 2. Assessment of different runoff predictive models.
Fig. 9 plots the three runoff predictive models and the observed runoff amount against precipitation in Melana watershed. With the calibrated SCS model, the obtained Ia is 0.043 mm. The minimum observed rainfall amount is 0.054 mm. The precipitation therefore exceeds the initial abstraction, so runoff is generated. On the other hand, with the conventional SCS-CN model the calculated Ia is 2.971 mm. According to SCS, any rainfall amount less than Ia generates no runoff. However, one third of the observed rainfall events violate this SCS constraint. Based on Fig. 9, when the precipitation is less than 0.5 mm the runoff cannot exceed 0.5 mm, yet the runoff depths predicted by the conventional SCS-CN and AFM models are higher than the 0.5 mm rainfall depth. There is a large difference between the predicted and the observed runoff depth when the precipitation reaches 61.6 mm. Compared with the calibrated SCS-CN model, the conventional SCS-CN model tends to over-predict runoff by up to 326 thousand m³ (equivalent to 130 Olympic size swimming pools) and to under-predict by up to 6 thousand m³. The AFM model, on the other hand, tends to over-predict runoff by up to 69 thousand m³ (equivalent to 28 Olympic size swimming pools) and to under-predict by nearly 62 thousand m³ (equivalent to 25 Olympic size swimming pools).
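The three assessment measures can be computed from paired observed and predicted runoff depths, as in the sketch below. The definitions used are the commonly quoted ones and are an assumption here, since they may differ in detail from those used for Table 2; in particular, BIAS is taken as the mean of predicted minus observed, so a positive value indicates over-prediction. The function names are hypothetical.

```python
def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def residual_sum_of_squares(observed, predicted):
    """RSS: sum of squared differences between predicted and observed runoff."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted))


def model_bias(observed, predicted):
    """BIAS: mean of (predicted - observed); positive means over-prediction."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)
```

A model is preferred when its Nash-Sutcliffe value is closer to 1, its RSS is smaller and its BIAS is closer to zero, which is the pattern reported for the calibrated model.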

Conclusion
In conclusion, the CN should not be selected from the NRCS handbook. The new CN derivation approach, which uses a supervised numerical optimization technique under the guidance of non-parametric inferential statistics, overcomes the difficulty of selecting an accurate CN from the tabulated handbook. In this study, the regional specific calibrated runoff prediction model performs better than the conventional SCS-CN model and the asymptotic curve number fitting method. The derived CN of Melana watershed is 90.45, statistically significant at the alpha 0.01 level. The runoff predicted by the conventional SCS-CN model tends to be over-predicted by up to 326 thousand m³, equivalent to 130 Olympic size swimming pools, and under-predicted by up to 6 thousand m³. Lastly, the new two decimal point CN system should be applied instead of rounding the CN. Although the CN value derived from the asymptotic CN method also has two decimal points, the new CN derivation approach still performs better, as shown by its higher Nash-Sutcliffe value, lower BIAS and smaller RSS. The CN value determined by the asymptotic CN method for the Melana site is 81.72, which tends to over-predict runoff by up to 69 thousand m³ (equivalent to 28 Olympic size swimming pools) and to under-predict by nearly 62 thousand m³ (equivalent to 25 Olympic size swimming pools).