Metamodeling of building energy consumption focused on climate, operation, space use and users related factors

Energy performance guarantee projects aim to achieve a given energy consumption in real-life conditions. Monitoring of building energy consumption during the operation phase often reveals that consumption is sensitive to the use of building spaces and to the quality of systems operation, especially for buildings with high energy performance characteristics [7]. Other investigations show the impact of building users' behaviour on energy consumption [28]. These factors must be added to climate factors to predict energy consumption during the operation phase. The number of factors and possible combinations is very high. Building energy modeling is limited in this respect, and metamodeling has been used to address the problem [25]. We developed metamodels that are polynomial functions, using a D-optimal design of experiments (DOE) approach. Such metamodels can become operational tools within the IPMVP framework, associated with an M&V plan. This paper shows the application of the method to a cultural building that comprises numerous systems and usages. We obtain a reliable metamodel of the energy consumption as a function of climate, operation, and space use factors, which meets the IPMVP [11] and ASHRAE Guideline 14 [3] criteria for modeling uncertainty. We also determine the global uncertainty resulting from the propagation of predictors' uncertainties and from the modelling uncertainty associated with the metamodel.


Introduction
The building sector is one of the largest greenhouse gas emitters at the global scale. Therefore, reducing the energy demand of the building sector is critical. In France, the national building energy code, named "RT", has been implemented and has gradually become more stringent, allowing for the improvement of building envelope and HVAC systems energy performance. However, the RT code evaluates a theoretical energy performance only. Moreover, this evaluation includes only a fraction of total energy end-uses. Since building envelope and HVAC systems characteristics regularly improve, the weight of non-regulatory energy end-uses increases. These energy end-uses are typically associated with building users' activities. In addition, high performance buildings show new issues related to HVAC systems operations and the impact of users' behaviour ([7], [28]). Because of these trends, associated with the ongoing increase of the building stock size, the actual energy consumption of the building sector in France has not decreased significantly ([1], [16]). Energy consumption during the operation phase often exceeds the energy consumption target calculated during the design phase [22]. An energy performance guarantee is an efficient tool to manage real-life energy consumption. It aims at achieving a real-life energy consumption target. It is used to quantify and manage the risk associated with energy consumption deviation. These risks are modeled with the baseline adjustment method. Adjustment is typically considered for heating/cooling degree-days. Adjustment models based solely on degree-days provide easy and fast calculations but are not accurate enough and do not consider other factors that may be important [18]. An energy performance guarantee may be limited in cases where changes in a building's use have too strong an impact on energy consumption, that is, when adjustments cause such variations that the relevance and economic balance of the energy performance contract are compromised.
In our view, this is a current limit to energy management, since the dedicated risk management tool cannot address such situations. In general, the factors that impact energy consumption belong to the following categories: climate, envelope, activities (occupancy, process), lighting, HVAC controls, water pipes and air ducts length, fresh air treatment, terminal units, domestic hot water (DHW), production and distribution efficiencies [20]. Some of these categories are addressed during the design and construction phases. Others are operation phase variables: climate, activities (occupancy and process), lighting and HVAC controls. Building energy performance simulation (BEPS) can be used to calculate the impact of these factors on energy consumption. However, the number of factors and possible combinations is high. Using BEPS would require thousands of simulations to carry out sensitivity analysis and uncertainty propagation, which is an obstacle ([6], [23]). Metamodeling has been used to address this issue since metamodels can be a good compromise between accuracy and calculation speed [17]. The design of experiments (DOE) approach has been used to create metamodels ([6], [18], [25]). Among the available design of experiments approaches, D-optimal designs are suited to creating polynomial functions that predict building energy consumption [25]. We propose a two-step method. Firstly, we developed polynomial energy models that can predict energy consumption as a function of the characteristics of the building's activities and of HVAC systems operation factors. To achieve this, we used the EnergyPlus software to build reliable energy models, along with the design of experiments (DOE) method to build a metamodel associated with a modeling error. In parallel, we analyzed several field reports to build a database of building operation phase factors that are potential energy consumption predictors.
We also defined how to determine the appropriate structure of the polynomial model to reach a good compromise between model accuracy and calculation time. Secondly, we used measurement and verification (M&V) data, associated with probability functions, to determine the uncertainty of the calculated energy consumption. Finally, we combined the latter with the polynomial modeling error to calculate the global uncertainty of the energy consumption, with the goal of identifying strategies to reduce it.

Methodology
The methodology we have developed aimed at reducing calculation time, identifying significant predictors, linking the model's structure and the M&V plan and being able to propagate predictors' value uncertainties through the model.

Global uncertainty calculation
The goal is to reduce the global uncertainty on energy consumption during the operation phase. Global uncertainty is defined as the sum of modelling, measurement and sampling uncertainties [11]. The expression given in [12] relates SE, the global standard error, to SE(modeling), the modeling standard error, and SE(measurement), the measurement standard error. Modeling error is well defined in the literature through the root mean square error (RMSE), the mean bias error (MBE) and their coefficients of variation. IPMVP [11] and ASHRAE [3] provide values to be used for energy model evaluation. In the associated uncertainty expression, CV(RMSE) corresponds to the baseline model, n is the number of baseline observation points, m is the number of M&V period observation points, Us is the measurement sampling error, REinstrument is the measurement instrument error and Uiv is the error of the model energy consumption calculation due to the sampling and measurement errors on predictors. A probability distribution can be defined for each predictor in relation to measurement and sampling errors during the M&V period. Moreover, with current computing capacities, probabilistic simulations are feasible. By randomly sampling values from the distributions associated with each predictor, we can calculate the probability distribution of the energy consumption determined by the adjustment model. This distribution can be associated with a standard error, which we name SE(propagation). Based on this principle, we define the global uncertainty by combining SE(propagation) with the modeling standard error SE(modeling).
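For reference, the quantities named in this section can be written out as follows. This is a sketch in common notation, not a reproduction of the expressions in [11] or [12]: the symbols y_i (measured value), \hat{y}_i (modeled value), \bar{y} (mean of measured values) and n (number of observations) are assumptions of this note, and some guidelines replace n by a degrees-of-freedom term in the RMSE denominator. The last line assumes that the modeling and propagation errors are independent and are therefore combined in quadrature.

```latex
\[
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\mathrm{CV(RMSE)} = \frac{\mathrm{RMSE}}{\bar{y}}
\]
\[
SE = \sqrt{SE(\mathrm{modeling})^2 + SE(\mathrm{propagation})^2}
\]
```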

Metamodel selection
In common M&V practice, in the context of options A, B and C as defined by IPMVP, the energy models used for adjustment are linear. These models are built from data metered during the baseline period with a multilinear regression approach. When using option D, numerical models built with simulation software can also be used. Building metamodels from simulation software can be seen as a third way to obtain an energy model to be used for M&V. There is a variety of metamodels, and multilinear functions are only one type, which may be too simple and only locally valid in some cases. Other types of metamodels include polynomial regression, multivariate adaptive regression splines (MARS), Gaussian process regression (GPR), artificial neural networks (ANN), support vector regression (SVR), classification and regression trees (CART) and random forests (RF). They have been used for multicriteria design optimization ([18], [25]), regression on metered data ([4], [8]), model calibration on existing buildings ([6], [20]), thermal and visual comfort [5] and the thermal behavior of complex systems ([9], [14]). In terms of scale, they have been used from building zone level to HVAC system [31], whole building ([19], [25]), campus ([27], [32]), city [24], building sector level [13] and country level [21]. In terms of time step, they have been used for hourly ([8], [15], [30]), daily [8], monthly and yearly calculations [25]. Several authors have analyzed the benefits and drawbacks of all these metamodels for building energy consumption calculation ([27], [32], [23]). To summarize, these authors show that multilinear or polynomial regression approaches still provide a good balance between accuracy, ease of use and transparency compared to more sophisticated approaches, which can also be called "black-box". The latter methods may give more accurate results in some cases, but they require an advanced level of expertise and more time to implement.
They are recommended for specific applications or situations where data is lacking. We summarize the comparison in Table 2. Among the models that can be obtained with a regression approach, we have considered quadratic polynomial functions. Such polynomial functions have been used successfully either to approximate building energy modeling results for heating and cooling loads during the design phase [25] or to build statistical models from metered data to estimate the effect of global warming on energy consumption [29]. Design of experiments ([18], [25]) allows us to identify the significant predictors as well as interaction effects, with statistical tests such as the p-value (qualitative) and with regression coefficients (quantitative). D-optimal designs are of particular interest because they aim to optimize the number of points for a better accuracy by changing the value of all descriptors in each set of data. This result is achieved by meeting the D-optimality criterion. To carry out this calculation, each factor must be expressed as a dimensionless value X between -1 and +1, obtained from the physical value x, which lies between xmin and xmax, by subtracting x0, the central value of the domain [xmin; xmax], and dividing by the variation step p, the half-width of the domain. The D-optimal dimensionless experiment matrix E is the one that maximizes the determinant of the matrix E'E.
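The coding of factors and the D-optimality criterion described above can be illustrated with a minimal sketch. It assumes a two-factor model with one interaction term and compares two hand-picked candidate designs; it is not a D-optimal search algorithm, and all function names (`to_coded`, `model_row`, `d_criterion`) are hypothetical helpers introduced for illustration.

```python
from itertools import product


def to_coded(x, x_min, x_max):
    """Map a physical value x in [x_min, x_max] to a dimensionless X in [-1, +1]."""
    x0 = (x_min + x_max) / 2.0  # central value of the domain
    p = (x_max - x_min) / 2.0   # variation step (half-width of the domain)
    return (x - x0) / p


def model_row(X1, X2):
    """Model matrix row for an assumed model: constant, X1, X2, X1*X2."""
    return [1.0, X1, X2, X1 * X2]


def gram(E):
    """Information matrix E'E of the experiment matrix E."""
    cols = len(E[0])
    return [[sum(r[i] * r[j] for r in E) for j in range(cols)] for i in range(cols)]


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(A[r][k]))
        if abs(A[pivot][k]) < 1e-12:
            return 0.0
        if pivot != k:
            A[k], A[pivot] = A[pivot], A[k]
            d = -d
        d *= A[k][k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= f * A[k][c]
    return d


def d_criterion(points):
    """det(E'E) for a list of coded design points; larger is better."""
    return det(gram([model_row(X1, X2) for X1, X2 in points]))


# Candidate A: the four corners of the coded domain.
corners = list(product([-1.0, 1.0], repeat=2))
# Candidate B: the same layout shrunk to half of the domain.
shrunk = [(0.5 * a, 0.5 * b) for a, b in corners]

print(to_coded(20.0, 18.0, 26.0))   # illustrative setpoint range, not from the paper
print(d_criterion(corners), d_criterion(shrunk))
```

In this toy comparison the corner design has the larger determinant, which is why designs that spread points towards the limits of the coded domain tend to be preferred under this criterion.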

Descriptors definition
We have analyzed a number of field reports ([2], [7], [10], [26]) as well as the literature ([20], [33]) to obtain a comprehensive mapping of the relevant descriptors of energy consumption during the operation phase. We have classified the descriptors in four categories: climate (temperature, absolute or relative humidity, solar irradiation), space use and process (occupancy types, rates and schedule, process energy use), users' behavior (openings and blinds use, DHW volumes, computer idle mode use…) and HVAC systems operation (temperature and flow rate setpoints, maintenance, regulation). We have considered that indicators can be defined for each category based on representative averages for each calculation timestep, in much the same way that degree-days are used to track how cold the weather is each month. Table 3 illustrates this approach. Using this type of definition, and having analyzed which descriptors are relevant to predict energy use during the operation phase, allows for the definition of the M&V plan.
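The analogy between degree-days and averaged operation descriptors can be sketched as follows. The 18°C base temperature, the function names and the sample values are assumptions chosen for illustration; they are not taken from Table 3.

```python
def heating_degree_days(daily_mean_temps_c, base_c=18.0):
    """Sum of positive differences between a base temperature and daily means.

    The 18 degC base is an assumed convention, not a value from the paper.
    """
    return sum(max(0.0, base_c - t) for t in daily_mean_temps_c)


def timestep_indicator(daily_values):
    """Representative average of a daily descriptor over one calculation timestep."""
    return sum(daily_values) / len(daily_values)


# Illustrative inputs only: four days of outdoor temperature and visitor counts.
temps = [10.0, 18.0, 20.0, 15.0]
visitors = [2800, 2800, 3500, 3500]

print(heating_degree_days(temps))
print(timestep_indicator(visitors))
```

Both functions reduce a daily series to one number per timestep, which is what lets a climate factor and a space-use factor enter the same polynomial model on an equal footing.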

Overall methodology
In practice, the metamodel and the M&V plan should be developed in parallel to ensure that relevant data are measured consistently with the model and that the global uncertainty is low enough to demonstrate proper energy management during the operation phase. Figure 1 illustrates the overall methodology.

Case study
We have applied this methodology to an 11000 m² cultural building located in Paris and built in 2015. This case study has been chosen because it is representative of the issue we have identified: its space use is highly variable because it is related to event programs, and there is a great diversity of spaces. Moreover, HVAC systems operation is subject to multiple constraints related to the security of the exhibited artworks. This means that energy performance is not the priority and that finding ways to save energy requires an accurate understanding of systems operation. We have worked on this case study since the design phase and have followed up its performance during operation. With regard to modeling, we began by building an energy model in EnergyPlus based on as-built documentation and a site visit. The first purpose of this model was to provide an energy performance evaluation for the French green building rating system HQE. Then, we applied the IPMVP option D methodology for energy performance follow-up. The energy model was adjusted to match data measured over 11 months. In parallel, M&V data were analyzed to understand the site's activities and HVAC systems operation. Based on this work and our operation phase descriptors database, we identified relevant descriptors to build a metamodel with the design of experiments method. The metamodel was tested against the energy model and allowed us to rank descriptors in terms of their weight on energy consumption. We analyzed and made assumptions about M&V data accuracy and the associated probability distributions. Lastly, we propagated these uncertainties through the metamodel and obtained a global uncertainty in order to determine how to reduce it.

Presentation of the case study
The studied building has "three" skins: exterior shadings that are part of the architectural concept and form an open assembly, although it is large enough to shade most of the building; a closed envelope called "iceberg volumes", which is composed of Ductal® concrete with 17 cm of insulation (U = 0.27 W/m².K) and a ventilated volume maintained at 14°C by dedicated heat recovery AHUs; and insulated walls (U = 0.33 W/m².K) that are in contact with heated/cooled interior spaces. Vertical and horizontal glazing have a Ucw of 1.7 W/m².K. SHGC is 43% and 16% for vertical and horizontal glazed areas respectively. Thermal zoning has been done in order to calculate realistic heating and cooling loads and total energy consumption. Overall, we followed the thermal zoning methodology described in Figure 2.

Fig. 2.
The heating and cooling plant consists of two thermo refrigerating pumps (TFP), which provide hot and chilled water simultaneously all year round. In heating mode, each machine can deliver 757 kW of heating capacity and reject 550 kW of cooling capacity that can be recovered. In cooling mode, each machine can deliver 650 kW of cooling capacity and reject 777 kW of heat that can be recovered.

Highlights about the site's operation and activities and assumptions
In this section we describe the findings about the site's activities that we used for energy model adjustment. Some information is derived from metered data, some from interviews and spot checks, and some is assumed and represents one plausible solution for energy model adjustment.
Exhibition galleries are tightly conditioned all year round, night and day. When there are many visitors, some galleries are exposed to uncontrolled outdoor airflows that add to the thermal loads. The average equipment and lighting power density in the galleries is 35 W/m². The galleries are open 10h/day, but the base lighting is turned on longer, from 6am, due to cleaning works. Each gallery has its own AHU with a specific fresh air damper position that was obtained after the system balancing procedure and never changed afterwards. The average damper position is 38%. The average supply fan runs at 59% of its design frequency. The average supply air temperature is 22.8°C. For all zones that are neither exhibition space nor the restaurant kitchen, the average equipment and lighting power density is 14.5 W/m², used during 10h30/day. The average supply air temperature is 23°C and the average return air temperature is 22.4°C. The average fresh air damper position is 74%, knowing that some of the AHUs are of the DOAS type. For the kitchen area, we had to make assumptions based on a similar site, as the kitchen was managed separately from the rest of the project, although not for energy supply and management. Average electrical power was taken at 450 W/m², with 80 W/m² of associated heat gains. This corresponds to full restaurant capacity (150 persons). We have linked this consumption to the actual occupancy rate of the restaurant, with a fixed minimum consumption of 30%. The restaurant was found to have an 80% occupancy rate on average. Lighting power density was taken at 15 W/m². The VAV AHU design supply airflow rate is taken at 20 ach, with a supply air temperature between 20 and 25°C. According to the site's statistics, an average of 2800 persons/day come during weekdays and 3500 persons/day during weekends. The site is open to visitors 10h/day, six days per week. We have assumed an even distribution of the visitors over the accessible areas and an average metabolic rate of 140 W/person.
The specific visitor flow to the auditorium was more difficult to estimate, and we obtained an average value of 225 persons per event, once a week during 4h. In addition, process energy uses were identified, including fountains (42 kW), one exterior lighting artwork (24 kW), exhaust fans used permanently (30 kW), exterior lighting (45 kW), escalators (37 kW) and pumps associated with plumbing (14 kW). Figure 3 illustrates the energy use breakdown of the site as calculated by the baseline energy model:

Fig. 3.
The relative difference between energy bills and modelled consumption over the monitored period, from June 2015 to April 2016, varies from 1% to 13%. From this, we calculated the RMSE of the model at 30098 kWh/month, giving a CV(RMSE) of 7.6%. Considering 11 observation points and a 95% confidence level, we use a t-statistic of 2.26 to define a modelling uncertainty of +/-68020 kWh/month, or +/-17.3%. With this range, later energy bills (2016 and 2017) fall within the modelling uncertainty, as shown in Figure 4.
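The arithmetic behind these figures can be retraced with a short sketch. The RMSE, CV(RMSE) and t-statistic are the values quoted in the text; the implied mean monthly consumption is derived here and is not stated in the paper. A t-statistic of 2.26 is consistent with a two-sided 95% interval at 9 degrees of freedom, which is an interpretation rather than something the text states.

```python
rmse = 30098.0    # kWh/month, quoted in the text
cv_rmse = 0.076   # 7.6 %, quoted in the text
t_stat = 2.26     # quoted in the text for 11 points at 95 % confidence

# Mean monthly consumption implied by RMSE / CV(RMSE); derived, not quoted.
mean_consumption = rmse / cv_rmse

# Half-width of the modelling uncertainty band, absolute and relative.
abs_uncertainty = t_stat * rmse
rel_uncertainty = t_stat * cv_rmse

print(round(mean_consumption), round(abs_uncertainty), round(100 * rel_uncertainty, 1))
```

The recomputed values differ slightly from the quoted +/-68020 kWh/month and +/-17.3%, presumably because the quoted CV(RMSE) and t-statistic are rounded.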

Metamodel development
The initial phase of energy model adjustment and analysis of the site's operation gave us a base model that is representative of the site's energy use, although it is not a calibration but only one plausible solution among others. Figure 5 illustrates the methodology we have developed to determine the proper structure of the metamodel. To determine the proper timestep, we found that the average duration of exhibitions was 4 months and that the heating and cooling modes of the thermo refrigerating pumps lasted 6 months each. Therefore, to be consistent with the variations in the site's usage, we could have chosen a quarterly timestep. However, considering that we had only 11 months of M&V data, we chose a monthly timestep to have more observation points. With regard to the spatial step, we iterated several times to optimize the DOE sample size. Indeed, the more refined the spatial step is, the more descriptors we have and the larger the sample size will be. We found that the optimum was to define three spatial groups: exhibition, non-exhibition and kitchen. Lastly, we grouped all lighting and equipment as one factor for each spatial group, and almost all factors were defined as 24h daily average values. Only the daily average internal heat gains were associated with occupied/unoccupied periods, because we considered the daily duration of equipment use to be a relevant descriptor.

Fig. 5.
We assumed a partially quadratic model to build the learning database with D-optimal design of experiments theory. Table 5 summarizes the sample size optimization, which reduced the number of descriptors from 63 to 22.

Table 5. DOE sample size optimization process.
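To give a sense of why reducing the number of descriptors matters for the DOE sample size, the sketch below counts the coefficients of a full quadratic model. This is an upper bound used for illustration only: the paper uses a partially quadratic model whose exact terms are not reproduced here, and `full_quadratic_terms` is a hypothetical helper.

```python
def full_quadratic_terms(k):
    """Number of coefficients in a full quadratic model with k factors:
    1 constant + k linear + k squared + k(k-1)/2 two-factor interactions."""
    return 1 + k + k + k * (k - 1) // 2


for k in (63, 22):
    print(k, full_quadratic_terms(k))
```

Since a design needs at least as many runs as coefficients to estimate, the number of simulations grows roughly with the square of the number of descriptors, which is consistent with the effort spent on grouping factors.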

Results
For each month, we used the p-value test to identify the descriptors that had an effect on the energy consumption, with a threshold of 0.01 for the p-value. Table 6 shows the significant descriptors for summer months and Table 7 for winter months. We then applied least-squares regression to determine the monthly polynomial functions. We show below the example for total energy consumption in June.
We added the energy use of the 4 processes to the polynomial function obtained. We then carried out 50000 calculations with the polynomial functions for each month, using random samples of descriptor values. Figure 6 illustrates the calculated monthly energy consumption distribution for December. Due to the unmeasured descriptors, the spread of the possible energy consumption values is large, although this classic bell-shaped curve has a clear mean value. Figure 7 shows the comparison between the mean value of the calculated energy consumption distribution and the actual energy bills over this period. The match is good: the relative difference never exceeds 10%, and it is 4.5% for the total over 11 months.

Fig. 7.
Modeling uncertainty is between 5.6% and 8.8%, propagation uncertainty is between 13.4% and 24.9% and global uncertainty is between 15% and 25%. We can see that the uncertainty related to the lack of information on non-measured descriptors exceeds the modelling related uncertainty. Therefore, the first step to reducing global uncertainty is to improve the M&V plan, and especially the equipment and lighting average power and use schedule.
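The propagation step can be sketched as a Monte Carlo loop over a polynomial. Everything specific in this example is hypothetical: the three coded predictors, their distributions and the polynomial coefficients are placeholders, not the fitted monthly functions. The quadrature combination in `combine` is likewise an assumption that the modeling and propagation errors are independent.

```python
import math
import random
import statistics


def metamodel(X):
    """Hypothetical partially quadratic polynomial in three coded predictors (kWh/month)."""
    x1, x2, x3 = X
    return (400000.0 + 30000.0 * x1 + 20000.0 * x2 + 10000.0 * x3
            + 5000.0 * x1 * x2 + 4000.0 * x1 ** 2)


def propagate(n_draws=50000, seed=0):
    """Sample predictor values, evaluate the metamodel, return (mean, standard deviation)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        X = (
            rng.gauss(0.0, 0.1),             # well-measured predictor: narrow normal
            rng.uniform(-1.0, 1.0),          # unmeasured predictor: whole coded domain
            rng.triangular(-0.5, 0.5, 0.0),  # partly known predictor: triangular
        )
        results.append(metamodel(X))
    return statistics.fmean(results), statistics.stdev(results)


def combine(se_modeling, se_propagation):
    """Quadrature combination, assuming independent error sources."""
    return math.hypot(se_modeling, se_propagation)


mean, se_propagation = propagate()
se_modeling = 0.07 * mean  # placeholder value inside the 5.6-8.8 % range quoted above
print(mean, se_propagation, combine(se_modeling, se_propagation))
```

In this toy setup the unmeasured predictor, sampled over its whole domain, dominates the spread, which mirrors the finding that improving the M&V plan is the first lever for reducing global uncertainty.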

Conclusions
The method was applied to a cultural building that is in operation. We built a polynomial model for monthly total energy consumption as a function of factors such as the number of visitors, the minimum humidity setpoints of the exhibition galleries, or specific equipment power density. The modeling error is always less than 10% compared to the EnergyPlus model. We then used available monitored data over a period of 11 months, associated with their uncertainties, to estimate the total energy consumption and compare it with real energy bills. Results show a difference of less than 10% between the average value of the predicted energy consumption and the real energy consumption for each month. The global uncertainty of the estimate is between 15% and 25%, with the largest fraction due to the uncertainties related to input data. The results show that this method is suited to modeling and monitoring energy consumption in relation to building use and HVAC systems operation factors. Operation phase factors can be expressed as daily averages, similarly to heating or cooling degree-days, and the structure of the polynomial models can easily be related to the M&V plan. This method can therefore help to better manage energy consumption during the operation phase.