Exploration of HVAC system sizing based on building performance simulation and Monte Carlo method

This study uses the Monte Carlo method and building performance simulations to develop an additive model for rapid peak load forecasting at the design phase that accounts for the effects of design parameters. The Monte Carlo method generates a large number of simulation cases, and EnergyPlus is used for the calculations. A total of 20 parameters were considered in the peak load calculations, including design day conditions, envelope performance, and infiltration. An office building was selected as the reference building. Screening experiments based on the standardized regression coefficient identified 15 important parameters for the peak cooling load in the perimeter zones and 7 in the core zone. Main effects and interactions for the selected parameters were determined by factorial experiments of 40,000 runs for the perimeter zones and 2,187 runs for the core zone, and were then used to develop an additive model relating design parameters to peak cooling loads. Finally, validation against an additional 1,000 cases gave a coefficient of determination of 0.995, a mean bias error of 3.2%, and a coefficient of variation of 3.7%, indicating that the developed additive model is highly accurate.


Introduction
As the first step in HVAC system design, accurate peak cooling load calculations are critical to reducing building energy consumption and ensuring system operating efficiency. Chiu et al indicated that relying on oversized by up to 100%. Right-sized equipment reduces the first cost and future replacement costs, extends the life span of equipment, reduces energy use, and improves occupant comfort [3]. Over the past decade, several studies have conducted uncertainty and sensitivity analyses to identify the influence of building parameters on heating and cooling loads and on energy consumption. These studies tended to investigate annual loads or energy consumption and seldom focused on peak loads. Zhu et al [4] took an office building as an example and used the Monte Carlo method with building performance simulation to forecast building loads at the planning stage. They concluded that the key factors affecting peak cooling load, annual cooling demand, peak heating load, and annual heating demand differ in both their importance ranking and the sign of their correlation.
To determine the impact of uncertainty on peak loads, Fernando et al [5] proposed an alternative framework that can assess the risks associated with each design decision, rather than simply presenting peak loads based on a single-case simulation. Sun et al [6] explored the use of uncertainty analysis and sensitivity analysis in HVAC system sizing to replace the safety factor with quantified margins.
To avoid oversizing, the green building certification system (GBCS) in Taiwan [7] requires HVAC systems with proper capacity as a prerequisite. Taiwan's GBCS adopts the area index method to estimate the peak cooling loads and confirm the appropriateness of system capacity. However, the area method ignores the influence of building physics and internal loads on peak cooling loads. The studies mentioned above have shown that building design parameters have an important impact on the uncertainty of peak cooling loads. Therefore, this study proposes a simple and accurate model, based on the Monte Carlo method and building performance simulation, that takes design parameters into account when forecasting peak loads in the early design stage. The outcomes of this study are expected to help Taiwan's GBCS rethink and improve its evaluation method for properly sizing an HVAC system.

Methodology
In line with the research objectives, the framework shown in Figure 1 is divided into three stages: design parameters, peak load simulation, and the establishment of a predictive model.
At the first stage, this study selected the design parameters that HVAC engineers frequently consider when calculating the peak load, such as location, design day conditions, building envelope variables, air leakage and ventilation rates, internal heat gains, and temperature set points. Table 1 lists the 20 parameters and their ranges of variation considered in this study.
The reference building model is a middle floor of a generic air-conditioned commercial building with a 40 m × 40 m plan. As shown in Figure 2(a), the study space is divided into four perimeter zones (north, east, west, and south) and a core zone. The ASHRAE Handbook of Fundamentals [8] was used to identify the design day conditions and building materials.
At the second stage, the distribution of the predicted peak cooling load was generated using Monte Carlo simulations that considered the uncertainties in weather, building physics, and internal heat gains. EnergyPlus was used for modeling and simulation, with each case described by an EnergyPlus input data file (IDF). Latin hypercube sampling (LHS) with a uniform probability distribution was used to select the level of each parameter, and the sampled values were substituted into the IDF. For each combination of parameters, a new IDF was generated and an EnergyPlus simulation was performed.
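The sampling step can be illustrated with a minimal sketch of Latin hypercube sampling under a uniform distribution. It uses only the Python standard library; the function name `latin_hypercube`, the parameter names, and the ranges are hypothetical placeholders rather than values from Table 1, and the generation of IDFs and the EnergyPlus runs are not shown.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points with exactly one point per equal-width stratum
    in every dimension.

    bounds maps a parameter name to its (minimum, maximum) range.
    Returns a list of dicts, one per simulation case.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns[name] = [
            low + (s + rng.random()) / n_samples * (high - low) for s in strata
        ]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


# Hypothetical ranges for illustration only (not the values in Table 1).
example_bounds = {"window_area_ratio": (0.1, 0.5), "glass_shgc": (0.2, 0.8)}
cases = latin_hypercube(250, example_bounds, seed=42)
```

Each dict in `cases` would correspond to one parameter combination to be written into an IDF. Stratifying every parameter range is what lets a few hundred runs cover the design space more evenly than plain random sampling.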
At the third stage, based on the results from the Monte Carlo simulations, an additive model was developed sequentially. A screening experiment was first performed to rank the parameters by standardized regression coefficients (SRC) and to filter out parameters that did not reach a statistically significant effect. In this experiment, the floor shape of the building model is a rectangle (Figure 2(a)). As shown in Table 1, 19 parameters were selected for the screening experiment. Theoretically, the peak cooling load in the core zone is independent of the envelope parameters, so the core zone analysis was limited to the 8 parameters related to ventilation rate and internal heat gains. To reduce the number of simulations, each parameter was set at two levels, a maximum and a minimum value. Peak load calculations were performed for 250 random combinations of design parameters generated by LHS. The statistical analysis of parameter significance, based on the simulation results for the core zone and the four perimeter zones, considered only main effects and ignored interactions.
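The SRC ranking can be sketched as follows. The paper's own equations are not reproduced here, so this sketch assumes the common textbook definition, in which the SRC of a parameter is its linear regression coefficient multiplied by the ratio of the input's standard deviation to the output's standard deviation. That is equivalent to regressing the z-scored output on the z-scored inputs. The function names and the synthetic data are hypothetical.

```python
import random
import statistics


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def z_scores(values):
    mean, sd = statistics.fmean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def standardized_regression_coefficients(inputs, output):
    """inputs: one row of parameter values per run; output: one load per run.

    Returns one SRC per input column (assumed definition: b_i * sd(x_i) / sd(y)).
    """
    k = len(inputs[0])
    zx = [z_scores([row[j] for row in inputs]) for j in range(k)]
    zy = z_scores(output)
    # Normal equations on standardized data; no intercept is needed
    # because every standardized column has zero mean.
    gram = [[sum(p * q for p, q in zip(zx[i], zx[j])) for j in range(k)] for i in range(k)]
    rhs = [sum(p * q for p, q in zip(zx[i], zy)) for i in range(k)]
    return solve_linear_system(gram, rhs)


# Synthetic illustration: the first input dominates, the third has no effect.
rng = random.Random(0)
demo_inputs = [[rng.random() for _ in range(3)] for _ in range(250)]
demo_output = [80 + 30 * a - 10 * b + rng.gauss(0, 1) for a, b, _ in demo_inputs]
demo_src = standardized_regression_coefficients(demo_inputs, demo_output)
```

Sorting the parameters by the absolute values in `demo_src` gives the importance ranking, and the sign of each entry gives the direction of the effect.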
The purpose of ranking the parameters by SRC and filtering out the statistically insignificant ones was to reduce the number of factorial experiments. The SRC of each input parameter can be calculated according to Equations (1)-(4). The absolute value of the SRC indicates the relative importance of the input parameter, while its sign indicates the direction of the effect. As the second step, factorial experiments were applied to test and analyse the interactions between factors, that is, how the influence of one factor varies with the levels of other factors. A three-level factorial experiment, using the maximum, median, and minimum levels (Table 1), was adopted because the relationship between the peak cooling load and the parameters may be non-linear. The one exception is the azimuth, which uses four levels: north, east, west, and south. The number of parameters considered for the core zone is 7 (see section 3.1 for details), so the full factorial design requires 3⁷ = 2,187 simulation runs, which is manageable with current computing capacity. For the perimeter zone, the number of parameters increases to 15 (see section 3.1 for details), and the number of runs required for the full factorial design (4 × 3¹⁴) is beyond what can practically be simulated and analyzed; thus, a fractional factorial experiment was applied instead. A massive number of simulations increases the precision of the models [9]. The peak cooling load was therefore simulated for 40,000 random combinations, which is only 0.2% of the number of runs required for the full factorial experiment.
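The run counts quoted above can be checked with a few lines of arithmetic. In this sketch the level labels and variable names are illustrative; only the counts (7 three-level parameters for the core zone, a four-level azimuth plus 14 three-level parameters for the perimeter zone, and 40,000 sampled runs) come from the text.

```python
from itertools import product

LEVELS = ("min", "median", "max")

# Core zone: 7 parameters at three levels each, enumerated in full.
core_design = list(product(LEVELS, repeat=7))
core_runs = len(core_design)  # 3**7

# Perimeter zone: azimuth at four levels plus 14 parameters at three levels.
perimeter_full_runs = 4 * 3**14
sampled_runs = 40_000
sampled_share = sampled_runs / perimeter_full_runs

print(core_runs, perimeter_full_runs, f"{sampled_share:.2%}")
# prints: 2187 19131876 0.21%
```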
Accurate peak cooling load calculation prevents oversizing of HVAC systems, which leads to higher capital costs, energy inefficiency, and higher operating costs in buildings. Taiwan's GBCS also allows the use of dynamic simulation software to estimate peak cooling loads accurately from a detailed description of the building, but this is time-consuming and inconvenient for users. Therefore, a simple calculation method is needed that determines the peak cooling load while accounting for the uncertainties caused by the selected parameters. An additive model is an approach often used in experimental design to predict outputs under specific input conditions with simple arithmetic [10]. Thus, an additive model is applied in this study to represent the relationship between the peak cooling loads obtained from EnergyPlus simulations and the selected factors. The additive model between the design parameters and the peak cooling load comprises the main effects and interaction effects of the design factors on the output, as shown in Equation (5). The main effect of each design parameter at a single level is given by Equation (6). The interaction between two design parameters, computed from the peak load simulation results, is given by Equation (7).
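A minimal sketch of such an additive model is given below. Because the paper's equations are not reproduced in this text, the sketch assumes the standard design-of-experiments forms: a main effect is the mean output at a level minus the grand mean, and a two-factor interaction is the cell mean minus the grand mean and the two main effects. The function names, the integer level codes, and the synthetic load formula are hypothetical.

```python
from collections import defaultdict
from itertools import product
from statistics import fmean


def fit_additive_model(runs, interactions=()):
    """runs: list of (levels, load) pairs, where levels maps parameter -> level.

    Returns (grand_mean, main_effects, interaction_effects).
    """
    grand_mean = fmean(load for _, load in runs)
    by_level = defaultdict(list)
    by_pair = defaultdict(list)
    for levels, load in runs:
        for name, level in levels.items():
            by_level[(name, level)].append(load)
        for a, b in interactions:
            by_pair[(a, levels[a], b, levels[b])].append(load)
    main = {key: fmean(v) - grand_mean for key, v in by_level.items()}
    inter = {
        key: fmean(v) - grand_mean - main[(key[0], key[1])] - main[(key[2], key[3])]
        for key, v in by_pair.items()
    }
    return grand_mean, main, inter


def predict(model, levels, interactions=()):
    """Grand mean plus main effects plus the selected interaction effects."""
    grand_mean, main, inter = model
    total = grand_mean + sum(main[(n, lv)] for n, lv in levels.items())
    total += sum(inter[(a, levels[a], b, levels[b])] for a, b in interactions)
    return total


# Synthetic full factorial: three parameters at levels 0, 1, 2 (hypothetical).
PAIRS = (("WR", "SHGC"),)
demo_runs = [
    ({"WR": w, "SHGC": s, "VR": v}, 60 + 10 * w + 5 * s + 4 * w * s + 8 * v)
    for w, s, v in product(range(3), repeat=3)
]
demo_model = fit_additive_model(demo_runs, PAIRS)
demo_errors = [abs(predict(demo_model, lv, PAIRS) - load) for lv, load in demo_runs]
```

In this balanced synthetic design the only non-additive term is the WR-SHGC product, so the model reproduces every run once that interaction is included. With real simulation data the fit would be approximate.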
Model validation, which demonstrates the accuracy and feasibility of a predictive model, is an important step in model development. Confidence in the model increases when it is validated with a dataset different from the one used to create it, with parameter values other than the factorial levels, and with the parameters excluded from the model allowed to vary. For this reason, an additional 1,000 combinations for model validation were created by the Monte Carlo method based on the building model shown in Figure 2(b). The 10 m² non-air-conditioned zone in the upper left corner of Figure 2(b) is designed to give the four perimeter zones different floor areas. The 20 parameters listed in Table 1 include the aspect ratio, ranging from 0.5 to 2.0, which is used only for model validation. Each design parameter is allowed to vary continuously between its maximum and minimum values.
To ensure that the developed additive model produces reasonable results, the peak total cooling loads calculated through EnergyPlus simulation and through the additive model were compared for these 1,000 cases. The quality of the developed model was evaluated by the mean bias error (MBE), coefficient of variation (CV), and coefficient of determination (R²), as shown in Equations (8)-(10). A low MBE and CV together with a high R² indicate that the model is sufficiently accurate and can therefore be used for predicting the peak cooling load. Note that the window area ratio in Table 1 is the ratio of the window area to the floor area of the perimeter zones.
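The three quality indices can be sketched as follows. The paper's Equations (8)-(10) are not reproduced in this text, so the definitions below are assumptions based on common practice: MBE as the summed residual divided by the summed simulated load, CV as the root-mean-square error divided by the mean simulated load, and R² as one minus the ratio of residual to total sum of squares. The function name and the sign convention (predicted minus simulated) are likewise assumptions.

```python
from math import sqrt
from statistics import fmean


def validation_metrics(simulated, predicted):
    """Return (MBE, CV, R2) as fractions, comparing model output to simulation."""
    n = len(simulated)
    residuals = [p - s for s, p in zip(simulated, predicted)]
    mean_sim = fmean(simulated)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((s - mean_sim) ** 2 for s in simulated)
    mbe = sum(residuals) / sum(simulated)
    cv = sqrt(ss_res / n) / mean_sim
    r2 = 1.0 - ss_res / ss_tot
    return mbe, cv, r2


# Hypothetical loads in kW, for illustration only.
mbe, cv, r2 = validation_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
```

Multiplying `mbe` and `cv` by 100 gives percentages comparable in form to those reported for the validation cases.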

3 Results and discussions

3.1 Analysis of screening results
Figure 3 shows, as box-and-whisker plots, the possible range of peak cooling loads in the core zone and the four perimeter zones for the 250 cases in the screening experiment.
Compared with the core zone, where the cooling load is driven by ventilation and internal heat gains, the perimeter zones are also affected by conduction heat gain through exterior walls and solar heat gain through windows. Unsurprisingly, the peak loads of the perimeter zones are generally higher and more widely distributed than those of the core zone. The peak loads of the core zone ranged from 40 to 119 W/m², with an interquartile range (IQR) of 32 W/m² and an average of 76 W/m². Among the four perimeter zones, the east and west zones show wider distributions and IQRs, as well as higher averages. Specifically, the peak loads of the north, east, west, and south zones range over 56 to 158 W/m², 60 to 201 W/m², 67 to 244 W/m², and 53 to 152 W/m², with IQRs of 31, 39, 44, and 30 W/m² and means of 101, 125, 139, and 99 W/m², respectively.
Fig. 3. Box-and-whisker plots of peak cooling loads from screening experiment
The SRCs of all parameters considered in this study, sorted by the magnitude of their absolute values, are shown in Table 2. As all parameters are independent, a parameter with a larger absolute SRC has a greater influence. Furthermore, a positive SRC means that the parameter is positively correlated with the peak cooling load, and a negative SRC means that it is negatively correlated. Table 2 shows that the ventilation rate (VR) has the greatest impact on the peak cooling load in both the core and perimeter zones, followed by four parameters related to the solar heat gain of the window, namely window area ratio (WR), orientation (OR), overhang projection ratio (OPR), and SHGC, and then by the parameters related to internal heat gains, namely equipment power density (EPD), lighting power density (LPD), and occupancy density (OD). Applying stepwise regression to the linear fit of peak cooling load against all parameters, i.e., Equation (1), makes it possible to distinguish which parameters are statistically significant.
Based on the criterion of p-value > 0.01, internal mass (IM), thermal mass of the exterior wall (TMW), absorptivity of the wall (AW), and location (SL) were regarded as insignificant parameters; the absolute value of the SRC of each of these excluded factors is less than 0.05. Ultimately, the important parameters are those marked with superscript 1 or 2 in Table 2, of which there are 15 for the perimeter zone and 7 for the core zone.

Results of factorial experiments
Figure 4 shows the peak cooling load distributions of the 10,000 cases in each of the four perimeter zones and the 2,187 cases for the core zone, which are approximately normal. The range of peak cooling loads in each zone obtained from the factorial experiments is consistent with the range from the screening experiment shown in Figure 3. The reference values of the peak cooling load for each zone in Taiwan's GBCS are marked with an asterisk in Figure 4: 184.1 W/m² for the north zone, 252.9 W/m² for the south zone, 298.0 W/m² for the east or west zone, and 141.8 W/m² for the core zone. These reference values are clearly higher than the maximum of the possible range obtained in this study. Because the existing evaluation method does not consider the uncertainty in peak cooling loads caused by different design solutions, the fixed values provided by Taiwan's GBCS are not suitable as a basis for accurately judging whether an HVAC system is oversized.
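The gap between the fixed reference values and the simulated results can be quantified with a short calculation. The sketch below divides each GBCS reference value quoted for Figure 4 by the largest simulated peak load reported for the same zone in the Conclusion. The variable names are illustrative, and the ratio is only one possible way of expressing the margin.

```python
# GBCS reference values and simulated maxima, both in W/m2, as quoted in the text.
reference = {"north": 184.1, "east": 298.0, "west": 298.0, "south": 252.9, "core": 141.8}
simulated_max = {"north": 153.2, "east": 217.7, "west": 235.4, "south": 150.3, "core": 110.8}

# Ratio above 1 means the fixed reference exceeds every simulated case.
ratio = {zone: reference[zone] / simulated_max[zone] for zone in reference}

for zone, value in ratio.items():
    print(f"{zone}: reference is {value:.2f} times the simulated maximum")
```

Every ratio is above 1, so in each zone the reference value exceeds even the most demanding simulated design.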

Main effects and interaction effects
In this study, the additive model between design parameters and peak cooling loads comprises the main effects and interaction effects of the design factors on the output (Equation (5)). Only the interactions that can be explained physically and that the literature [11,12] identifies as important were selected. Thus, the interaction effects between the parameters related to window solar heat gains, namely WR and SHGC, WR and OPR, and SHGC and OPR, are considered in Equation (5). Once the form of the model has been chosen, the next step is to determine its means, main effects, and interactions.
The average peak loads for the experiment designs in the north, east, west, south, and core zones were 96.0, 119.5, 134.1, 94.4, and 72.6 W/m², respectively. Figure 5 shows the average peak cooling load at each level of each parameter. For most parameters, the peak cooling load varies approximately linearly with the parameter value. The steeper the slope, the greater the main effect of the parameter on the peak cooling load. Specifically, VR was found to be the most important design parameter for the peak cooling load, followed by WR, OPR, and SHGC. The slopes for WR, OPR, and SHGC in the north and south zones are much flatter than in the east and west zones. Finally, this study guarded against overfitting by applying stepwise regression to check whether Equation (5) contains statistically insignificant main effects or interactions. The significance tests show that the main effects of UW, UG, and SH reached a significance level of p<0.01, and the other main effects and interaction effects reached a significance level of p<0.001. These results indicate that all selected main effects and interactions should be retained in the additive model of Equation (5). Table 3 lists the main effects of the selected parameters based on Equation (6), and Tables 4 and 5 list the interaction effects according to Equation (7).

Model validation
The main effects and interaction effects for a specific validation case are determined by interpolation in Tables 3 to 5 and then substituted into Equation (5) to obtain the peak cooling load per unit floor area of each zone. Once the peak cooling load per floor area of each zone is known, the total peak cooling load of the case can be obtained by the area method. Figure 6 plots the predicted peak cooling loads against their simulated values and shows good agreement between the simulated and predicted data. In addition, the obtained values of MBE, CV, and R², which are 3.2%, 3.7%, and 0.995, respectively, show good accuracy. The results of the model validation indicate that the additive model developed in this study has high accuracy.
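The two steps described above, interpolating a tabulated effect and summing zone loads by area, can be sketched as follows. The text does not state the interpolation scheme, so linear interpolation between adjacent levels is an assumption here. The function names, the tabulated effect values, and the zone areas are hypothetical and are not taken from Tables 3 to 5.

```python
def interpolate_effect(levels, effects, value):
    """Linearly interpolate a tabulated effect (assumed scheme).

    levels must be in ascending order, with one effect per level.
    """
    if not levels[0] <= value <= levels[-1]:
        raise ValueError("value lies outside the tabulated range")
    for i in range(len(levels) - 1):
        x0, x1 = levels[i], levels[i + 1]
        if x0 <= value <= x1:
            weight = (value - x0) / (x1 - x0)
            return effects[i] + weight * (effects[i + 1] - effects[i])


def total_peak_load(load_per_area, floor_area):
    """Area method: sum over zones of load density (W/m2) times floor area (m2)."""
    return sum(load_per_area[zone] * floor_area[zone] for zone in load_per_area)


# Hypothetical main-effect table for one parameter at its three levels.
effect = interpolate_effect([0.2, 0.5, 0.8], [-10.0, 0.0, 12.0], 0.65)

# Hypothetical zone results: load densities in W/m2 and floor areas in m2.
total_w = total_peak_load({"north": 100.0, "core": 70.0}, {"north": 200.0, "core": 800.0})
```

Because the validation cases vary each parameter continuously, some form of interpolation between the factorial levels is needed before the tabulated effects can be summed.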

Conclusion
The maximum allowable peak cooling load benchmark in Taiwan's GBCS is significantly higher than the maximum of the possible range obtained from a large number of simulations. The distribution of peak cooling loads for the 40,000 perimeter-zone cases generated by the fractional factorial experiments approximates a normal distribution. The peak cooling loads of the four perimeter zones ranged from 51.1 to 153.2 W/m² in the north zone, 57.8 to 217.7 W/m² in the east zone, 64.6 to 235.4 W/m² in the west zone, and 48.9 to 150.3 W/m² in the south zone. The peak cooling loads for the 2,187 core-zone cases produced by the full factorial experiment ranged from 37.6 to 110.8 W/m². Parameters with an absolute SRC greater than 0.05 were regarded as having a significant effect on the peak cooling load; accordingly, 15 important factors were selected for the perimeter zone and 7 for the core zone. The ventilation rate has the greatest influence on the peak cooling load, followed by four parameters related to window solar heat gain (window area ratio, orientation, overhang projection ratio, and glass SHGC) and the parameters related to internal heat gains (equipment power density, lighting power density, and occupancy density). Internal mass, thermal mass of the exterior wall, absorptivity of the wall, and location are regarded as insignificant parameters.
According to the main effect analysis of the selected 15 or 7 parameters, the main effects of the U-values of wall and glass and of story height reached a statistically significant level of p<0.01, while the main effects of all other factors reached a very significant level of p<0.001. This study considered three interactions: window area ratio with SHGC of glass, window area ratio with overhang projection ratio, and SHGC of glass with overhang projection ratio. All three reached a very significant level of p<0.001. The developed additive model was validated on an additional 1,000 cases different from the dataset used to develop the model. Comparing the peak cooling load predicted by the additive model with that calculated by EnergyPlus simulation gave good fitting results, with MBE = 3.2%, CV = 3.7%, and R² = 0.995, which indicates that the developed additive model has high accuracy.
The model presented in this paper is expected to improve the current mechanism for addressing HVAC oversizing in Taiwan's GBCS and ultimately to help realize the goal of energy conservation.