Whole building validation for simulation programs including synthetic users and heating systems: experimental design

A large-scale study for validating building energy simulation programs against measured data was undertaken within IEA EBC Annex 71 “Building energy performance assessment based on optimized in-situ measurements” as a more complex and realistic successor of the dataset created previously in IEA EBC Annex 58. The validation method consists of a set of high quality measurement data and a precise documentation of all boundary conditions. This enables a user to create a complete model of the different validation scenarios. The results of this model can be compared to the real measurement data. Because of the detailed modelling, the remaining deviations should indicate the limitations of the tool under investigation. The definition of the scenarios consists of extensive weather data and a detailed description of the building geometry, component compositions, thermal bridges, air tightness, ventilation, etc. In addition to the previous Annex 58 dataset, this experiment contains synthetic users with internal heat and moisture gains, operated doors and windows, and underfloor heating with an air source heat pump. This paper sets out the experimental design, a key element in ensuring a useful experimental dataset.


Introduction
There is increasing emphasis in the design of buildings, both new build and retrofit, to reduce energy consumption, at the same time as ensuring good indoor environmental quality in terms of thermal comfort, air quality, lighting and acoustics. This has led recently to an increase in the range of available technologies that designers need to consider. Examples are advanced glazings, ventilation systems, novel heating and cooling systems, renewable energy system integration, thermal and electrical storage, and the integration and control of these systems. Given the complexity of the dynamic heat and mass transfer processes involved, modelling is commonly needed for selection of the most suitable and cost effective energy solution.
It is therefore essential that the modelling programs used to predict energy and internal environmental performance are thoroughly tested (as well as the users of those programs) in order to give confidence in their ability to provide accurate predictions.
Although there have been a number of large international and national validation studies (e.g. [1] and [2]), the majority have focused on inter-program comparisons such as the well-known BESTEST, or on small single room outdoor test cells. These validation studies have been useful for uncovering program errors and limitations of predictive accuracy. However, there is a need to develop realistic empirical validation test cases of full-scale buildings to provide confidence that the dynamic thermal simulation programs can represent the physical reality, and to quantify simulation uncertainties under realistic boundary conditions. Inter-program comparisons cannot consider effects not included in the models: they do not provide a "truth" model. Simplified empirical validation based on test rooms does provide true physical behaviour, but it does not include multi-zone interactions that may be important in practice, and often the scale of the test rooms does not reflect real buildings. In some cases the experimental design also excludes certain effects, e.g. air temperature stratification might be avoided on purpose by mixing fans.
In the IEA EBC Annex 58 "Reliable building energy performance characterization based on full-scale dynamic measurements" [3] a full-scale validation study was undertaken and published [4], [5]. The two fully documented datasets obtained in this study ([6], [7]) are available and are frequently downloaded and used for validation, training and teaching purposes.
Full-scale empirical validation is complex, time-consuming and costly, and requires a high quality experimental facility with experienced experimentalists and modellers. To be of validation quality, all flow paths and boundary conditions must be measured, with the building tested through a range of external boundary conditions and internal operation. It is believed that there have been no comprehensive validation datasets produced from full-scale buildings before Annex 58 and this experiment. The reason for attempting such experiments at this time is a combination of factors that should now improve chances of success: namely widespread availability of sensor and instrumentation equipment, the availability of sophisticated test buildings, knowledge regarding errors in previous experimental programmes and improvements in simulation programs to model low energy technologies to assist in the experimental design.
While the Annex 58 experiment had a rather simple design, focusing primarily on the basic heat and mass transfer mechanisms, the experiment described in this paper is intended to provide realistic boundary conditions for an advanced validation. (Synthetic) users bring several new aspects to a building and the simulations characterising its performance. The users' internal heat and moisture sources are stochastic and interact with the time-varying heat inputs of the heating system. Also the users will change the systems' heating set point temperature actively. All these actions influence the thermal-energetic regime inside the building. To use a realistic occupancy profile, 10 minute data from a Markov chain model, developed from an extensive time use survey [8], was used (see section 3.4.). During two experimental phases (user 2 / 3 in Fig. 4) the roomwise set temperatures of 21°C and 17°C were linked to this occupancy information. An internal heat source profile, based on the identical model [9], was used in the experiment.
One of the lessons learned from the Annex 58 validation experiment was sensitivity of a building's energy balance to factors related to the inside air body. In particular, these were the assumption of well-mixed air inside a single room versus the stratification that occurs in reality [10] and also deviations that were believed to originate in a poor mathematical representation of the inter-zonal air exchange between rooms.
This experiment was undertaken within the framework of the IEA EBC Annex 71 "Building energy performance assessment based on optimized in-situ measurements" [11]. The aims of the experiment are to obtain and apply high quality experimental datasets for the validation of building performance simulation tools. By applying a two-phase validation strategy, with the validation goal values withheld during the first (Blind) Phase, user errors are separated from critical model simplifications and program errors.
Given the complexity, it is important that a detailed experimental design is undertaken to define the experiment and instrumentation requirements. This paper sets out the developed design for testing over a full winter period.

Validation Methodology
The overall empirical validation methodology applied in this study was similar to that employed in IEA EBC Annex 58 [4] and other previous validation studies (e.g. [12], [13], [14]). The steps were as follows:
1. Experimental design. Model the selected building using a local climate database. The aim of this phase is to determine building time constants, suitable test sequences, magnitudes of heat inputs and variation in internal temperatures. It includes sensitivity tests to identify important simulation parameters that need to be measured.
2. Experimental set-up. Calibrate and install all required sensors and data acquisition systems, install and check the instrumentation system and program the heating and/or cooling as required.
3. Develop the specification, which describes all aspects of the building required for modelling.
4. Experiment. Undertake the experiment and process the experimental data.
5. Blind validation (Blind Phase). Modellers predict internal conditions using the experimental specification, measured climate data and operational schedules but without knowledge of internal conditions. At this stage, additional questions can arise regarding the experimental details; these questions and answers are distributed to all modelling teams. Modelling teams submit modeller reports with details of the programs used and assumptions made.
6. First stage analysis. This compares predictions against experimental data for internal temperatures and heat fluxes. Inevitably, at this stage, differences are due to a mix of user and modelling error (and potentially measurement errors).
7. Re-modelling (Open Phase). The measured data is disseminated. Modelling teams are encouraged to investigate differences between measurements and predictions and resubmit predictions and updated reports. Only changes correcting user modelling errors or altering a modelling assumption (with documented rationale) are allowed. It is important to ensure that model input parameters are not simply tuned to improve agreement with measurement. This step separates the modelling from the user error by eliminating the user errors.
8. Final analysis and archiving of high quality data sets.
The intention is that the resulting specification and datasets will be useful for developers of new programs and those improving modelling algorithms, as well as providing evidence of predictive capability of simulation programs.

Test Buildings
The Twin Houses of the Fraunhofer IBP (Fig. 1) were selected as the most suitable test facility. These buildings were previously used during the Annex 58 experiment and thus allow for a continuous evolution of the models already created and validated during the previous experiment. Compared to the previous Annex 58 experiment, the following aspects were added to the Annex 71 experiment to create a more realistic and thus a more complex but also more comprehensive validation scenario:
• Night setback of the heating's set point temperature.
• Set point temperature profile based on a stochastic user model [8].
• Internal heat sources of synthetic users, based on a stochastic user model [9].
• Internal humidity source of synthetic users.
It is possible to give the measured temperatures and ask for the required energy demand as the validation goal, or to give the measured heating powers and ask for the resulting room air temperatures. Considering the number of controls and time lags in an underfloor heating system and the related difficulties in simulation, it was decided to provide the heating powers (flow rates and supply temperatures for the underfloor heating) in this validation (except for the co-heating phase) and ask for the resulting temperatures.
Fig. 3 shows the Twin Houses' floor plan including the open/closed doors and windows. As can be seen, the attic is one single air volume, as is most of the ground floor, except for the kitchen (depending on the internal door's operation status) and the sleeping room. Depending on the experimental phase, the trap door connects or separates the two large air bodies of the Twin Houses. The attic's mechanical ventilation is mass flow controlled and balanced for each room (50 m³/h). The ground floor's supply air (100 m³/h) is injected into the living room and extracted on the building's other side from the dining room and bathroom. This results in air movement across the rooms and the connecting doors.
To allow for a more detailed analysis of the infiltration and the inter-zonal airflows, two tracer gases are used. CO2 is injected into child 1 and SF6 into the living room. The resulting concentrations of both gases are measured in the living room, in child 1, in the kitchen and in the dining room. The accumulation and decay of the concentration of these gases should allow for an analysis of the inter-zonal airflows. The measured concentrations are provided together with the other measurement data.
Since SF6 is about 5.5 times denser than air, the SF6 concentration is measured at two different heights in the living room. Since CO2 is a natural component of the atmosphere, a sixth detection point was installed in the building's supply air duct to be able to consider the baseline concentration.
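As an illustration of how a decay period in the tracer gas data might be evaluated, the following minimal sketch fits an exponential decay to a concentration series. It assumes a single well-mixed zone and a constant background concentration, which is a simplification of the multi-zone situation described above; the evaluation method is not stated in this paper, and the function name and arguments are hypothetical.

```python
import math

def air_change_rate_from_decay(times_h, conc_ppm, background_ppm=0.0):
    """Estimate an air change rate (1/h) from a tracer gas decay curve.

    Assumption: one well-mixed zone, so the excess concentration above
    background decays as exp(-n * t). The rate n is the negative slope of
    a least-squares line through ln(excess) versus time.
    """
    xs, ys = [], []
    for t, c in zip(times_h, conc_ppm):
        excess = c - background_ppm
        if excess > 0:  # the logarithm needs a positive excess
            xs.append(t)
            ys.append(math.log(excess))
    if len(xs) < 2:
        raise ValueError("need at least two points above background")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

For CO2, `background_ppm` would be taken from the supply air duct measurement; for SF6 it can usually be left at zero.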

Quality control and baseline measurements
All sensors were calibrated before the start of the experiment. The accuracy of the sensors and their data acquisition system was documented and provided for every measurement point. The airtightness of both buildings was measured and compared directly before the experiment. As can be seen in Table 1, the buildings' overall mean air tightness values are 0.87 h⁻¹ and 1.10 h⁻¹ at 50 Pa pressure difference. From these measured n50 values, infiltration air change rates of 0.061 h⁻¹ and 0.077 h⁻¹ can be estimated, assuming 7 % of n50 as average infiltration [15]. Together with the buildings' internal air volume of 337 m³, this means a difference of 5.4 m³/h. Combined with the mechanical ventilation of 200 m³/h and the estimated mean infiltration of 23.2 m³/h, this absolute difference corresponds to a relative difference of 2.4 % in both buildings' air exchange. The co-heating test ([16], [17], [18]) at the experiment's start also served as a baseline measurement comparing both buildings' energy consumption under identical boundary conditions. The evaluation of the co-heating test resulted in heat transfer coefficients (HTC) of 107.4 W/K and 111.7 W/K for the O5 and N2 houses respectively (based on daily averaged data). A difference of 3.9 % between both buildings' energy consumption can therefore be expected. The solar apertures were 10.2 and 11.4 m² for the O5 and N2 houses respectively.
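The infiltration estimate above can be reproduced with a few lines of arithmetic. The sketch below uses only the values quoted in the text (n50 values, 7 % rule, air volume, mechanical ventilation); the variable names are illustrative, and the assignment of the two n50 values to the individual houses is left open because it is not stated in this passage.

```python
def infiltration_flow(n50_per_h, volume_m3, factor=0.07):
    """Estimated mean infiltration flow in m3/h, taking a fixed share of n50."""
    return factor * n50_per_h * volume_m3

VOLUME_M3 = 337.0        # internal air volume of each house
MECH_VENT_M3_H = 200.0   # total mechanical ventilation flow

flow_a = infiltration_flow(0.87, VOLUME_M3)   # house with n50 = 0.87 1/h
flow_b = infiltration_flow(1.10, VOLUME_M3)   # house with n50 = 1.10 1/h

difference = abs(flow_b - flow_a)                  # about 5.4 m3/h
mean_infiltration = (flow_a + flow_b) / 2          # about 23.2 m3/h
relative_difference = difference / (MECH_VENT_M3_H + mean_infiltration)  # about 2.4 %
```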

Synthetic users
To ensure the experimental analysis incorporates the influence on heat gains of a diverse and representative range of occupant behaviours, a stochastic modelling approach was used to develop realistic occupancy profiles. These were used directly to determine occupant heat gains and indirectly to generate appliance and hot water use profiles and, from these, the associated heat gains. These unique occupancy, occupancy-driven electrical and hot water demand, and heat gain profiles were generated using the occupant-differentiated probability-based approach [19]; this used UK Census, Time-Use-Survey and monitored demand data to generate model calibration data that are differentiated by a number of key household characteristics (e.g. household type/age, income, tenure, etc.) to capture the higher-level behavioural variations. The behaviour variations between similar households are also incorporated, with further calibration data ensuring the model output reflects individual household behaviours and not a composite of the group behaviour.
The profile generation process uses a bottom-up approach: first the occupancy model is generated, which is then used to determine individual use timing and demand profiles for each of the appliances probabilistically assigned to the household based on ownership data, and for hot water use. The occupancy model uses a Markov-chain approach with further statistical manipulations to replicate individual household behaviours. The appliance and hot water models use an event-probability approach, where the number of events per day is determined probabilistically and then the timing in relation to the predicted occupancy. Flett [19] describes the profile generation process in detail.
Heat gains are generated in relation to the occupant state (i.e. if awake or asleep) and energy input, including a delayed release of generated heat from appliances with high thermal inertia, such as a cooker, and a 50% allocation for hot water to allow for heat lost in the drained water. The calculated heat gains are apportioned to each room within the experimental house. For occupancy and lighting gains, a probabilistic model based on Time-Use-Survey data is used to determine the activity and room location of each individual. Appliance-related heat gains are added based on the expected location, and hot water gains are split proportionally between kitchen and bathroom based on volume used. For this experimental analysis, the number of bedrooms was set at three, and the number of adults and children at two each. For the sensitivity analysis described in section 3.7, 100 sets of output data were generated, with all other household characteristics allowed to vary probabilistically. Analysis of typical output variability against real data in Flett [19] has shown that this number of results should provide a representative range of behaviours for this type of household. For the experiment itself, one randomly selected profile was implemented into the Twin Houses' PLCs.
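To make the Markov-chain idea concrete, the following is a minimal sketch of a first-order chain producing a 10-minute occupancy profile. It is not the model of [19]: the three states, the time-invariant transition probabilities and the function name are all assumptions chosen for illustration, whereas the actual model is calibrated on survey data and differentiated by household characteristics.

```python
import random

STATES = ("away", "active", "asleep")

# Hypothetical transition probabilities per 10-minute step (each row sums to 1).
TRANSITIONS = {
    "away":   {"away": 0.95, "active": 0.05, "asleep": 0.00},
    "active": {"away": 0.03, "active": 0.94, "asleep": 0.03},
    "asleep": {"away": 0.00, "active": 0.02, "asleep": 0.98},
}

def simulate_occupancy(n_steps, start="asleep", seed=None):
    """Return a list of occupant states, one per 10-minute step."""
    rng = random.Random(seed)
    state = start
    profile = [state]
    for _ in range(n_steps - 1):
        row = TRANSITIONS[state]
        # Draw the next state according to the current state's row.
        state = rng.choices(STATES, weights=[row[s] for s in STATES])[0]
        profile.append(state)
    return profile

# One day at 10-minute resolution corresponds to 144 steps.
day_profile = simulate_occupancy(144, seed=1)
```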
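The hot water rule described above (50% of the energy counted as a gain, split between kitchen and bathroom by volume) can be written as a short function. This is a sketch under the stated rule only; the function name, argument names and units are hypothetical, and the delayed heat release from appliances is not covered.

```python
def hot_water_gains(energy_wh, kitchen_litres, bathroom_litres, gain_fraction=0.5):
    """Split the hot water heat gain between kitchen and bathroom by volume used.

    gain_fraction is the share of the hot water energy that stays in the
    building; the remainder is assumed lost with the drained water.
    """
    total_litres = kitchen_litres + bathroom_litres
    if total_litres <= 0:
        return {"kitchen": 0.0, "bathroom": 0.0}
    gain_wh = gain_fraction * energy_wh
    return {
        "kitchen": gain_wh * kitchen_litres / total_litres,
        "bathroom": gain_wh * bathroom_litres / total_litres,
    }
```

For example, 1000 Wh of hot water energy with 10 litres drawn in the kitchen and 30 litres in the bathroom gives 125 Wh to the kitchen and 375 Wh to the bathroom.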

Underfloor heating system
The Twin Houses' ground floors are equipped with a typical wet screed underfloor heating, so the piping is inside the floor's screed. The attics are equipped with a dry screed system. The piping is fixed into an insulation layer, covered by a 25 mm dry screed board. Aluminium heat conducting lamellas between the pipes and the screed board optimize the system's heat transfer to the room. The heat source is a standard air-to-water heat pump serving not only the heating system but also the synthetic users' domestic hot water tappings.

Experimental schedule
To facilitate the possibility of a side-by-side experiment, as is possible at the IBP's Twin Houses, only a single parameter was chosen to be the difference between both buildings.
For the "Main Experiment" this is the heating system. While the Reference Building (N2) was heated with power controlled electrical convectors (i.e. the underfloor heating system was not used), as in the Annex 58 experiment, the Test Building (O5) was heated with a hydronic underfloor heating powered by an air source heat pump.
In the "Extended Experiment" both buildings were heated identically with electrical convectors. In the Test Building internal moisture sources were added to the living room while the Reference Building was still "dry".
In these two basic experiments, there were several phases, as can be seen in Fig. 4. The following phases were contained in the Main and the Extended Experiment:

Co-heating Phase
During this phase, also used as the Main Experiment's initialisation to equalise thermal storage in the two houses, the indoor air temperature was kept constant at 21°C by electrical heating in both houses, while mixing fans kept the air temperature uniform. The mechanical ventilation and synthetic users were off in this phase. The data from this phase were analysed to compare both houses' heat loss coefficients as a part of the baseline measurements. During the Blind Validation the measured air temperatures (at 110 cm) were provided to the modelling teams but not the heat inputs. Also, the calculated heat transfer coefficients were not released to modelling teams until after the Blind Validation.
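One common way to evaluate co-heating data is a linear regression of the daily mean heating power on the indoor-outdoor temperature difference and the solar irradiance, Q = HTC · ΔT − A_sol · I_sol. The sketch below implements this regression without intercept; whether this exact formulation was used for the values reported in this paper is an assumption, and the function and argument names are hypothetical.

```python
def fit_coheating(q_heat_w, delta_t_k, solar_w_m2):
    """Least-squares fit of Q = HTC * dT - A_sol * I_sol (no intercept).

    Inputs are daily means: heating power (W), indoor-outdoor temperature
    difference (K) and solar irradiance (W/m2).
    Returns (HTC in W/K, solar aperture in m2).
    """
    # Normal equations for the regressors x1 = dT and x2 = -I_sol.
    s11 = sum(dt * dt for dt in delta_t_k)
    s22 = sum(i * i for i in solar_w_m2)
    s12 = sum(-dt * i for dt, i in zip(delta_t_k, solar_w_m2))
    b1 = sum(dt * q for dt, q in zip(delta_t_k, q_heat_w))
    b2 = sum(-i * q for i, q in zip(solar_w_m2, q_heat_w))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("regressors are collinear; cannot separate HTC and aperture")
    htc = (b1 * s22 - s12 * b2) / det
    a_sol = (s11 * b2 - s12 * b1) / det
    return htc, a_sol
```

With only a few days of data the two coefficients can be strongly correlated, so a sufficiently long period with varied weather is needed for a stable result.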

User 1 Phase
All rooms were heated identically with a fixed night setback from 11 pm to 6 am; the Test Building's underfloor heating was operational and the synthetic users' internal heat gains were active. Ground floor and attic space were separated by the closed trap door. During the Blind Validation the room air temperatures were withheld from the participating modelling teams. For the Reference House the electrical heating powers were provided. Time-varying supply water temperatures and flow rates were provided for the Test House's underfloor heating for each room.

User 2 Phase
In the User 2 Phase the rooms' set temperature profiles were provided for each room, following the occupancies of the synthetic users. These users also operated the kitchen's internal door and the external window of child 1; the trap door was open permanently.

(Re-)Initialisation Phase
After the User 2 Phase, between the Main and the Extended Experiment, there was another initialisation phase. Both houses were again heated electrically at a constant set temperature with no synthetic users. For this phase, all data were available to the modelling teams, including during the Blind Validation.

User 3 Phase
The User 3 Phase marked the start of the Extended Experiment. From this phase onwards both houses were heated electrically and the Test House's living room had an internal moisture source aligned with the occupancy profile. Otherwise this phase was identical to User 2. During the Blind Validation the electrical heat inputs were available while the room air temperatures were withheld.

PRBS Phase
During this phase the heating inputs and the internal heat sources were replaced by heat inputs following a Pseudo Random Binary Signal (PRBS). The signal magnitude was determined in the experimental design by a simulation of the experiment. There were no influences from the synthetic users except for the moisture source. The trap door was closed for a part of this phase. The PRBS Phase is primarily intended for the training of low order building models. During the Blind Validation the electrical heat inputs were available while the room air temperatures were withheld.
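For readers unfamiliar with PRBS excitation, the following sketch generates a pseudo random binary sequence with a 7-bit linear feedback shift register (polynomial x^7 + x^6 + 1, period 127) and maps it to an on/off heating power. The register length, hold time and power level used in the actual experiment are not given here, so all parameters in this example are assumptions.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS7 sequence (polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero in its lowest 7 bits")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of the two highest register bits.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits

def prbs_heat_input(bits, power_w, steps_per_bit=1):
    """Hold each bit for steps_per_bit time steps and scale it to a heating power."""
    return [power_w if bit else 0.0 for bit in bits for _ in range(steps_per_bit)]

# Hypothetical example: 500 W heater, each bit held for six 10-minute steps.
heat_profile = prbs_heat_input(prbs7(127), power_w=500.0, steps_per_bit=6)
```

The hold time per bit sets the lowest and highest frequencies that are excited, so in practice it would be chosen in relation to the building's time constants.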

Free-float Phase
In this last experimental phase all aspects of the synthetic users were active except the external window in child 1, and there was no heating, so the room air temperatures were free-floating. During the Blind Validation the synthetic occupant heat gains were available while the room air temperatures were withheld.

Sensitivity analysis
Building Performance Simulation (BPS) was used to assist the experimental design of the full-scale empirical validation exercise. The Fraunhofer Twin Houses were simulated using EnergyPlus V8.8 [20]. The input parameters used in the simulation models were specified based on up-to-date information for the buildings, including post-construction drawings of building geometry, construction details of existing fabric, infiltration rates measured with a blower door test, among others [21]. An additional simulation based on WUFI Plus TM [22] was deployed to design the extended experiment's internal moisture source.
Deterministic simulation was initially used to replicate the actual experiment, acknowledging that almost all input parameters fed to the simulation models were subject to a certain level of uncertainty. To address this, a Sensitivity Analysis (SA) using the method of Morris ([23], [24]) was employed as a screening method to indicate which input parameters have the most significant impact on simulation predictions and which factors need to be measured more accurately in preparation for or during the monitoring experiment. The Morris method shows the overall influence of the input parameters on the results (i.e. absolute value of μ*, Fig. 5), as well as the monotonic behaviour in the model (graphical representation of σ vs μ*, Fig. 6) ([24], [25], [26], [27]). If the input factors are positioned below the σ/μ* = 0.1 line, their behaviour is considered linear. If the input factors are positioned between the lines σ/μ* = 0.1 and σ/μ* = 0.5, they are monotonic. If the input factors are between the lines σ/μ* = 0.5 and σ/μ* = 1, they are almost-monotonic. Finally, if they are above the σ/μ* = 1 line, they are considered highly non-linear and non-monotonic [27].
A list of uncertain parameters was created, considering what will/can be measured as part of the experiment and what information will be released to the modelling teams of the empirical validation exercise. Their base values were specified to the best of existing knowledge at the time of the analysis. A uniform distribution with a fixed relative range of 20% was assigned to each parameter, as an initial estimate, in the absence of more certain information. The SA was performed using Python SALib [28]. 570 simulations were conducted using JEPlus 1.7 [29].
The results of the SA showed that the most influential input factor was the specification of thermal bridges in the model (Fig. 5), having a linear effect on the heating demand (Fig. 6). The ranking obtained in Fig. 5 was used to identify the experimental aspects with the most significant contributions to uncertainties, allowing for a systematic improvement of the initial experimental design. Consequently, a higher number of junctions were analysed in detail as part of the experiment than was initially intended. The mechanical ventilation supply flow rates of the living room and the attic space were also found to have a significant impact on, and a linear effect on, the simulation output. Recognising, however, that this is a parameter that can be specified and measured with high precision during the actual experiment, this observation did not alter the experimental design considerably. Finally, the temperature of the cellar boundary condition and the hot water flow rate of the underfloor heating system were found to be two more important parameters; this finding resulted in parts of the instrumentation undergoing a second calibration process (i.e. the underfloor heating flowmeter) and further temperature and heat flux sensors being installed in the cellar.
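The σ/μ* classification used in the screening can be expressed compactly. The sketch below computes μ* and σ from a list of elementary effects for one input factor and applies the thresholds quoted above; the treatment of values lying exactly on a threshold and the function names are assumptions made for this example, and the elementary effects themselves would come from the Morris sampling (e.g. via SALib).

```python
import statistics

def morris_indices(elementary_effects):
    """Return (mu_star, sigma) for one input factor.

    mu_star is the mean of the absolute elementary effects,
    sigma the sample standard deviation of the elementary effects.
    """
    mu_star = sum(abs(e) for e in elementary_effects) / len(elementary_effects)
    sigma = statistics.stdev(elementary_effects)
    return mu_star, sigma

def classify_morris(mu_star, sigma):
    """Classify a factor's behaviour by its sigma/mu_star ratio."""
    if mu_star <= 0:
        raise ValueError("mu_star must be positive to form the ratio")
    ratio = sigma / mu_star
    if ratio < 0.1:
        return "linear"
    if ratio < 0.5:
        return "monotonic"
    if ratio < 1.0:
        return "almost-monotonic"
    return "non-linear and/or non-monotonic"
```

A factor whose elementary effects are all equal has σ = 0 and is classified as linear, while effects that change sign between trajectories lead to σ/μ* above 1.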

Conclusions
Undertaking a comprehensive full-scale validation study requires a large commitment in time and resources. Key requirements are:
• A high quality, fully documented test facility with an experienced experimental team.
• Several modelling teams with experienced modellers using a range of simulation programs.
• Realistic test sequences that cover a range of internal and external conditions.
Attention to experimental design is critical in ensuring the resulting datasets are fit for purpose. A pragmatic procedure was adopted to cover the following aspects:
• Determination of the main influencing factors on performance, varying them through a realistic range.
• Inclusion of random elements which cover the range of conditions expected in "real life". For weather it means covering an extended period and for occupancy it means making sure the magnitude and stochasticity are realistic.
• Ensuring the variable factors have a significant effect on the "independent" metric used. This could be temperature (e.g. in a free float period) or heat input to maintain a setpoint.
• Ensuring all important influencing factors are measured to a sufficient level of accuracy. This was investigated through sensitivity analysis.
• Reducing measurement error through calibration and data checking.
• Full documentation of the experimental specification and measurements.
• Use of side-by-side experiments to focus on one or more important influencing factors.
The experimental design benefitted from the experience gained in Annex 58, with an evolution in complexity level. This led to additional sensors to measure air temperature distribution for monitoring stratification, and airflow instrumentation to monitor inter-zone air exchanges.
The presented experiment provides a comprehensive dataset including detailed documentation that can be used either for the validation of Building Energy Simulation (BES) programs or for teaching and training purposes. This experiment complements the experiment already carried out in IEA EBC Annex 58 by adding synthetic users, a hydronic underfloor heating system, tracer gas measurements and more complex internal airflows.
The deployment of a simulation of the intended experiment proved to be of high importance in setting several parameters during experimental design, for example the amplitude of the heat inputs during the PRBS phase and the design of the internal moisture source. In addition, the sensitivity analysis, derived from the EnergyPlus simulation, revealed several flaws in the initial design that could be avoided with the information gained and the resulting changes to the experiment's design, documentation and instrumentation.
A Blind and Open Phase approach, separating user and program errors, is currently underway with the participation of a number of modelling teams using a variety of detailed simulation programs. Since the Open Data are already released, a truly blind validation is no longer possible. Nevertheless, this Validation Exercise can continue to serve its purposes very well. Since all data are available now, a validation team can choose its validation goals freely. This allows for an adaptation of the validation procedure to be more suitable for the individual validation task of the simulation researcher or engineer.
The German author would like to acknowledge support from the German Federal Ministry for Economic Affairs and Energy for funding the experiment and additional parts of the work described in this publication under the reference number 03ET1509A.
The analysis of the co-heating test was done by Dr. Richard Fitton and Dr. Alex Marshall from the University of Salford, Manchester, UK. The WUFI Plus TM simulation including the design of the internal moisture sources was done by Matthias Pazold for Fraunhofer IBP's Hygrothermics department.