Comparison of model identification techniques for MPC in all-air HVAC systems in an educational building

In school and office buildings the ventilation system contributes substantially to the total energy use. A control strategy that adjusts the operation to the actual demand can significantly reduce the energy use. This is especially important in rooms with a highly fluctuating occupancy profile. However, standard rule-based control is reactive, making the installation 'lag behind' the demand. Model predictive control (MPC) might be a solution. To implement MPC, a suitable model must first be identified for reliable predictions of room temperature and CO2 concentration. For CO2 predictions three occupancy scenarios are proposed, based respectively on a counting camera, the lecture schedule and a motion sensor. Two model identification techniques are evaluated: ARX and RC models. For identifying the heating dynamics of the case study building the 3-state RC model performed well: a 5-step-ahead prediction on a 15-minute time interval gave an RMSE of approximately 0.60 °C. The 3rd-order ARX model gave similar results; however, cross-validation demonstrated that the RC model outperforms the ARX model. For CO2 predictions the counting scenario resulted in the most accurate n-step-ahead predictions. The RMSE found for the RC model is at most 90 ppm, compared with 140 ppm for the ARX model. RC models are recommended for modelling all-air HVAC systems because of their higher prediction accuracy compared with ARX models. In addition, unlike ARX models, these models retain physical parameters.


Introduction
The first recast of the EPBD required that from 2020 all new buildings in the EU have to be nearly zero energy buildings (nZEB) [1]. In school and office buildings, the ventilation system has a large contribution to the total energy use [2]. A control strategy that adjusts the operation to the actual demand can significantly reduce the energy use. This is important in rooms with a highly fluctuating occupancy profile, such as classrooms and open offices. However, a standard rule-based control strategy is reactive, making the installation 'lag behind' in relation to the demand. As a result, a good indoor climate is not always guaranteed and the actual energy saving potential is lower than predicted.
This study focuses on nZEB, where slower reactions to disturbances are expected as a result of the high insulation level and air tightness of the building envelope. Furthermore, internal heat gains have a higher impact in these kinds of buildings. In addition, there can be a discrepancy between the heating demand and the ventilation demand. Model predictive control (MPC) might be a solution, as an MPC takes into account both the current situation and the future demand. MPC has already shown savings for hydronic systems in operating buildings, as indicated in recent studies [3], [4]. Some studies in the current literature have already focused on MPC for ventilation systems. Huang, Wang and Zu [5] created a robust MPC for a ventilation system with variable air volume (VAV). The control strategy resulted in a more robust control compared to a PI control. The MPC was able to satisfy constraints when used for temperature control. A recent study by Liang et al. [6] focused on MPC for an HVAC system with VAVs. A low-order state space model was developed and a Kalman filter was applied for state estimation. Simulations showed savings for MPC of 17.5% on the electrical energy consumption.
To implement an MPC control for all-air HVAC systems, first a suitable model must be identified that can be used for future predictions of room temperature and CO2 concentrations. The goal for the MPC is to control both the indoor air quality (IAQ) and the thermal comfort while minimizing the operation costs of the air handling unit (AHU). In order to achieve this, two parameters have to be controlled: the air flow rate and the supply air temperature. To achieve this, suitable models need to be identified to capture the relevant dynamics of the building and systems for reliable predictions of the zone temperature and CO2 concentration.
In current research, already a lot of attention is devoted to model identification regarding the thermal aspects. The challenge is to obtain a simple but accurate model [7]- [9]. Currently most of the cost and time is lost during model creation and calibration, which is 70% of the costs in this type of projects [10]. Bacher & Madsen [8] showed an approach to identify models for heating dynamics of a building. A grey-box model based on resistor capacitor (RC) models was used to identify suitable models. The RC-model was able to provide detailed knowledge of the heating dynamics of the building. Privara et al. [11] showed different methods to perform model identification focussed on building modelling. In this study different grey and black-box methods have been evaluated for model identification. The main conclusion is that for larger datasets and/or more complex systems a black box method is the only suitable option for model identification. However, since it is a black box model it does not preserve a physical structure resulting in a less accurate model for larger prediction horizons [11]. The advantage is that a black box approach is less time and computational demanding for large datasets. In a final note it is concluded that for buildings with a simpler structure physical models should be used. Moreover, Reynders et al. [12] showed a method for grey box models to reduce the model order complexity while maintaining an accurate model. The reduced order model (ROM) showed good results for simulations compared to detailed models in which the energy use for heating, the heat emitted by the heating system and the air temperature was obtained. De Coninck et al. [13] developed a toolbox for data-driven grey box models. This toolbox automates different steps in the system identification process. Validated models for forecasting could be identified using this toolbox. The obtained models can be used for forecasting and MPC. 
The generated model for a single zone showed a good prediction performance, with a root mean square error (RMSE) of 0.33 K for a 20-day simulation.
The challenge is to identify models that can be used in an MPC controller for predictions of both the indoor temperature and the indoor CO2 concentration. The two identified models will be combined into one global model with a multi-objective control. To implement this control action for an MPC, CO2 predictions have to be included in the dynamic model. Therefore, a suitable model needs to be identified that includes the CO2-related aspects, which are not included in the thermal models. Pantazaras et al. [14] focused on model identification to obtain a controller model from CO2 measurement data. State space models were used to identify models that were able to predict CO2 concentrations within the accuracy range of a good CO2 sensor (i.e. ±50 ppm). Macarulla et al. [15] used a stochastic grey box model to predict CO2 concentrations inside an office. Different grey box models of increasing complexity were developed for this purpose. The best selected model was able to predict the CO2 concentration with an RMSE of 41 ppm compared to the measurement data. However, both of these studies used actual counting data for the occupancy. In practice this data is rarely measured by the building monitoring system; therefore, other methods and their impact on the results need to be explored.
The outline of the paper is as follows: section 2 describes the method used for the model identification and introduces the case study building. The next section presents the results for the temperature and CO2 predictions. Finally, a conclusion and discussion are given on the approach used and its future application in the MPC.

Method
Model identification is applied to a case study building using measurement data obtained from the building monitoring system (BMS). Two model identification techniques are analysed: RC models (grey box) and ARX models (black box). With both modelling approaches, predictions are made for the indoor temperature and the indoor CO2 concentration of one zone. This requires a dynamic model of the thermal dynamics and a second model that covers the CO2-related aspects. Both models share a coupling parameter: the mass flow of air. To identify suitable models for CO2 concentration predictions, three scenarios for occupancy estimation are proposed, based on available sensors and prior knowledge:
• Number of people based on monitoring results of a counting camera
• Lecture schedule with indication of expected group size and start and end time of the lecture
• Occupancy as a result of a sensor for motion detection (1 = occupied, 0 = unoccupied)

Model identification Grey box
For the grey box modelling the toolbox CTSM available in R [16] is used. CTSM uses the maximum likelihood estimation (MLE) to estimate the unknown parameters. The model structure used in grey box is derived from resistor-capacitance (RC) networks. The models are formulated as stochastic differential equations (SDE) in a state space structure.

Temperature identification (heating dynamics)
To capture the relevant heating dynamics of the building and to predict the indoor temperature, input data is collected from the BMS. The grey box model identification for temperature is based on the thermal dynamics of the room; the change in indoor temperature is expressed by equation (1). The quantities needed for the RC model and collected during the measurements are the global solar radiation (Qs), the ventilation heating power (Qvent), the ambient temperature (Ta) and the internal temperature (Ti). The models used in this study are a two-state and a three-state model, illustrated in Figure 1 and Figure 2 respectively. The global solar radiation is measured by the local weather station. The heating power is the heat supplied by the ventilation system, as given by equation (2).
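Since equations (1) and (2) and the model figures are not reproduced here, the following Python sketch only illustrates how a two-state RC network of this kind can be simulated. It assumes the common Ti/Te structure used in grey-box building models (indoor air node and envelope node) and assumes that the ventilation heating power follows Qvent = m_dot * cp * (T_supply - Ti). The parameter names (Ci, Ce, Rie, Rea, Aw), the supply temperature input and the forward-Euler discretisation are illustrative assumptions, not the models identified in this study.

```python
def ventilation_heating_power(m_dot, t_supply, t_indoor, cp_air=1005.0):
    """Heat delivered by the supply air [W].

    Assumed form: Q_vent = m_dot * cp * (T_supply - T_indoor),
    with m_dot in kg/s and cp_air in J/(kg K).
    """
    return m_dot * cp_air * (t_supply - t_indoor)


def simulate_two_state_rc(ti0, te0, inputs, params, dt):
    """Forward-Euler simulation of an assumed two-state (Ti, Te) RC network.

    inputs : list of (Ta, Qs, Qvent) tuples, one per time step
    params : dict with Ci, Ce [J/K], Rie, Rea [K/W], Aw [m2] (hypothetical names)
    dt     : time step [s]
    Returns the indoor temperature trajectory, including the initial value.
    """
    ci, ce = params["Ci"], params["Ce"]
    rie, rea, aw = params["Rie"], params["Rea"], params["Aw"]
    ti, te = ti0, te0
    trajectory = [ti]
    for ta, qs, qvent in inputs:
        # Indoor air node: exchange with envelope, ventilation heat, solar gains.
        dti = ((te - ti) / rie + qvent + aw * qs) / ci
        # Envelope node: exchange with indoor air and with ambient air.
        dte = ((ti - te) / rie + (ta - te) / rea) / ce
        ti += dt * dti
        te += dt * dte
        trajectory.append(ti)
    return trajectory
```

In a grey-box workflow the parameters would be estimated from the measurement data (here with CTSM in R); the sketch only shows the deterministic part of the state equations for given parameter values.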

CO2 concentration identification
The grey box model identification for CO2 predictions is based on the principle of the mass balance. The change in CO2 concentration inside a room is given by equation (3). For the motion scenario, the CO2 generation rate is multiplied by the motion sensor signal instead of the number of people (P) in equation (3). The scheduled scenario uses the number of people as indicated on the time schedule. The counting scenario uses the number of people as counted by the camera. In all three scenarios the value for the CO2 generation rate is identified by the RC model.
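As an illustration of the mass balance idea, the Python sketch below implements a generic single-state, well-mixed CO2 balance of the form V dC/dt = q (C_supply - C) + P G. This textbook form is an assumption, since equation (3) is not reproduced here. The room volume of 380 m³ is taken from the building description; the supply concentration of 400 ppm and the generation rate of 0.018 m³/h per person are placeholder values, whereas in this study the generation rate is identified from data.

```python
def co2_step(c, airflow, occupancy, dt_h,
             volume=380.0, c_supply=400.0, gen_rate=0.018):
    """One forward-Euler step of a single-zone CO2 mass balance.

    c         : current concentration [ppm]
    airflow   : ventilation flow rate [m3/h]
    occupancy : number of people (counting/schedule scenario)
                or a 0/1 signal (motion scenario)
    dt_h      : time step [h]
    gen_rate  : CO2 generation per unit of occupancy [m3/h] (assumed value)
    """
    # 1e6 converts the volumetric CO2 source from a volume fraction to ppm.
    dc_dt = (airflow * (c_supply - c) + occupancy * gen_rate * 1e6) / volume
    return c + dt_h * dc_dt


def predict_co2(c0, airflows, occupancies, dt_h, **model_kwargs):
    """Simulate the concentration forward for given airflow and occupancy series."""
    predictions = []
    c = c0
    for q, p in zip(airflows, occupancies):
        c = co2_step(c, q, p, dt_h, **model_kwargs)
        predictions.append(c)
    return predictions
```

With a 0/1 motion signal as occupancy input, gen_rate no longer represents one person but a lumped generation rate for whatever group is present, which is one way to read why that scenario loses information on group size.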

Model identification ARX models
Autoregressive models with exogenous input (ARX) are modelled in R and describe the effect of the inputs u(t) on the output y(t). The general structure of an ARX model is expressed in equation (4). The one-step-ahead prediction is estimated using the least squares identification technique; multiple-step-ahead predictions use the recursive least squares method. The ARX model for temperature identification is based on measurement data of indoor and outdoor temperature, solar radiation and heating power. The ARX model for CO2 predictions is based on measurement data of CO2 concentration, occupancy based on the proposed scenarios, and the airflow rate.
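To make the ARX idea concrete, the sketch below fits a single-input ARX model of the textbook form y(t) = a1 y(t-1) + ... + a_na y(t-na) + b1 u(t-1) + ... + b_nb u(t-nb) + e(t) by ordinary least squares, and produces n-step-ahead predictions by iterating the one-step predictor with its own predictions fed back. This is a simplified Python illustration under stated assumptions: the study itself works in R, uses several inputs, and reports recursive least squares for the multi-step case. All function names are hypothetical.

```python
def build_regressors(y, u, na, nb):
    """Rows phi(t) = [y(t-1..t-na), u(t-1..t-nb)] with targets y(t)."""
    start = max(na, nb)
    rows, targets = [], []
    for t in range(start, len(y)):
        row = [y[t - i] for i in range(1, na + 1)]
        row += [u[t - j] for j in range(1, nb + 1)]
        rows.append(row)
        targets.append(y[t])
    return rows, targets


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_arx(y, u, na, nb):
    """Least squares estimate of [a1..a_na, b1..b_nb] via the normal equations."""
    rows, targets = build_regressors(y, u, na, nb)
    k = na + nb
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict_n_steps(theta, y_hist, u_seq, na, nb, n):
    """Iterate the one-step predictor n times from the end of y_hist.

    y_hist : measured outputs y(0..t), most recent last
    u_seq  : inputs u(0..t+n-1), i.e. known or forecast inputs for the horizon
    """
    y = list(y_hist)
    for _ in range(n):
        t = len(y)
        phi = [y[t - i] for i in range(1, na + 1)]
        phi += [u_seq[t - j] for j in range(1, nb + 1)]
        y.append(sum(p * q for p, q in zip(theta, phi)))
    return y[len(y_hist):]
```

For the CO2 case, y would be the measured concentration and u the occupancy signal of the chosen scenario or the airflow rate; a multi-input version would simply append more lagged columns to each regressor row.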

Building properties
The building analysed is an educational building, built according to the Passive House standard and located in Ghent, Belgium. The building is built on top of an existing university building and contains 4 zones: 2 large lecture rooms, a staircase and a technical room. The lecture rooms have a floor area of 140 m², a volume of 380 m³ and a maximum capacity of 80 students each. A floor plan and a cross section of the building are shown in Figures 3 and 4. The building is used for lectures, but at the same time it is a test facility for research on building energy-efficiency strategies in a "real use" environment. Therefore, the lecture rooms are thermally insulated from the outside, the neighbouring zones and each other. The U-values of the construction parts are shown in Table 1.
The solar heat gain coefficient of the glazing is 0.52. The window-to-wall ratio is 26% for the NE façade and 27% for the SW façade. The air tightness of the lecture rooms at 50 Pa is 0.29 ACH for the first floor and 0.47 ACH for the second floor.
Moveable external blinds are applied on the windows on the south-western side. Blinds are closed when the incident solar radiation exceeds 250 W/m².

HVAC properties
A demand-controlled, balanced mechanical ventilation system is installed with an air-to-air heat exchanger with an efficiency of 78%. VAV boxes are placed in the supply and extract openings of each room, controlling the air flow rate based on the CO2 concentration and temperature in the lecture rooms. The maximum air flow rate per room is set at 2200 m³/h and the minimum at 400 m³/h. Two heating coils of 8 kW are integrated in the supply ducts. The heating production system consists of a condensing wood pellet boiler with an internal thermal storage of 600 l. The maximum heating power is 8 kW and the maximum efficiency is 106%. The supply fan has a maximum power of 1.57 kW and the extraction fan 1.33 kW, with an efficiency of 71%. The efficiency of the fan motors is 85%. Indirect evaporative cooling (IEC) with a maximum capacity of 13.1 kW is provided. The current settings for the HVAC equipment are as follows: the AHU operates from 07:30 to 17:30 on weekdays. The CO2 set point for the demand-controlled ventilation (DCV) system is 1000 ppm. The heating set point is 21°C, with a deadband of ±0.5°C. The standby temperature during non-operating hours is 15°C. The air flow rate and/or supply temperature is increased when one of these set points is not met. The heat exchanger is bypassed when the outdoor temperature is above 16°C or when the room temperature in one of the lecture rooms exceeds 22.8°C. IEC is activated when the room temperature exceeds 26°C and continues until the room temperature is lower than or equal to 20.5°C.
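The bypass and IEC rules described above are typical examples of the reactive, rule-based logic that the MPC is meant to improve on. A minimal Python sketch of these two rules, using the thresholds quoted in the text, could look as follows. The function names and the way state is passed around are illustrative assumptions and do not represent the actual BMS implementation.

```python
def heat_exchanger_bypassed(t_outdoor, room_temps):
    """Bypass rule: outdoor temperature above 16 degC, or any lecture room
    above 22.8 degC (thresholds as quoted in the text)."""
    return t_outdoor > 16.0 or any(t > 22.8 for t in room_temps)


def update_iec(iec_on, t_room):
    """Hysteresis rule for indirect evaporative cooling (IEC).

    Switches on when the room exceeds 26 degC and stays on until the room
    temperature is lower than or equal to 20.5 degC.
    """
    if not iec_on and t_room > 26.0:
        return True
    if iec_on and t_room <= 20.5:
        return False
    return iec_on
```

Because update_iec depends on its previous output, the caller has to carry the state from one control step to the next, for example `iec_on = update_iec(iec_on, t_room)` inside the control loop.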

Measurement data
A set of sensors has been installed to monitor indoor and outdoor conditions. The sensors used during the measurements are listed in Table 2. The building includes a weather station monitoring the main outdoor parameters: global horizontal solar radiation, outdoor temperature, relative humidity, and wind speed and direction. For the indoor conditions, the indoor temperature, the CO2 concentration and the indoor humidity are continuously monitored. The occupancy of the lecture room is measured by counting cameras. Measurement data is collected during four periods, as indicated in Table 3, on a 1-minute time interval. For each setting a training and a validation dataset are collected. In the first setting the room is empty and the blinds are not controlled, in order to identify the thermal characteristics of the building. In the second configuration the blinds are controlled, to identify the influence of solar heat gains. For the CO2 models an older dataset that includes occupancy is used. The one-minute input data is averaged over 5-, 15- and 60-minute intervals to study the prediction horizon of the identified models and to evaluate the chosen time interval.
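The averaging step can be sketched in a few lines of Python. The version below assumes non-overlapping blocks that start at the first sample and drops an incomplete trailing block; how the blocks were aligned in this study is not stated, so this is only one plausible reading.

```python
def block_average(samples, window):
    """Average consecutive, non-overlapping blocks of `window` samples.

    With one-minute data, window=5, 15 or 60 gives the 5-, 15- and
    60-minute series. An incomplete block at the end is discarded.
    """
    n_blocks = len(samples) // window
    return [
        sum(samples[i * window:(i + 1) * window]) / window
        for i in range(n_blocks)
    ]
```

For a fast signal such as CO2, a short rise at the start of a lecture is spread over the whole block, which illustrates how information can be lost at the 60-minute interval.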

Results
In this section the results of the model identification for both the indoor temperature and the CO2 concentration are discussed. The identified RC and ARX models are presented, validated and compared.

Indoor temperature identification
For the identification of suitable models for temperature predictions, measurement data from dataset 1, as illustrated in Figure 5, is used as input data.

Fig. 5. Input data for model identification of indoor temperature (dataset 1)
It is important to analyse the residuals (measured value minus predicted value) for the selected fitted models. The model can be regarded as validated when the vast majority of the residuals are within the 95% confidence bound [8].
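One common way to check whether residuals resemble white noise is the cumulative periodogram: for white noise the normalised cumulative spectrum follows a straight line, and deviations are compared against a confidence band. The Python sketch below uses a plain discrete Fourier transform and the large-sample Kolmogorov-Smirnov approximation of ±1.36/sqrt(q) for the 95% band, where q is the number of frequencies. The function names are hypothetical, and the exact band used by the tooling in this study may differ.

```python
import cmath
import math


def cumulative_periodogram(residuals):
    """Normalised cumulative periodogram over the Fourier frequencies
    strictly between zero and the Nyquist frequency."""
    n = len(residuals)
    mean = sum(residuals) / n
    x = [r - mean for r in residuals]
    q = (n - 1) // 2
    power = []
    for k in range(1, q + 1):
        # Direct DFT at frequency k/n; O(n^2) overall, fine for short series.
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(s) ** 2)
    total = sum(power)
    if total == 0.0:
        raise ValueError("residuals are constant; periodogram is undefined")
    cumulative, running = [], 0.0
    for p in power:
        running += p
        cumulative.append(running / total)
    return cumulative


def residuals_look_white(cumulative, ks_coeff=1.36):
    """True if the cumulative periodogram stays inside the approximate
    95% band around the straight line expected for white noise."""
    q = len(cumulative)
    band = ks_coeff / math.sqrt(q)
    return all(
        abs(c - (i + 1) / q) <= band for i, c in enumerate(cumulative)
    )
```

A model whose residuals still contain slow dynamics concentrates power at low frequencies, so its curve rises steeply at the start and leaves the band.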
The cumulative periodograms of the residuals for the 2- and 3-state RC models are illustrated in Figure 6. The figure shows that the residuals of the 2-state model cannot be considered white noise, so this model is not validated. For the 3-state model the residuals are all within the 95% confidence bound and can thus be regarded as white noise. In addition, the periodogram of the residuals of the validated 3rd-order ARX model is included in Figure 6, indicating that these residuals can also be regarded as white noise.
For the cross validation, dataset 2 is used as new input data for the fitted and validated models. With this new input data, new predictions are made using the fitted models, which indicates how robust the identified models are. Figure 8 depicts

CO2 concentration prediction
For the identification of suitable models for CO2 prediction, a dataset that includes occupancy has been used, since the four previous datasets did not contain occupancy. The input data used for the identification of the CO2 models is shown in Figure 9. The complete dataset consists of four weeks of data: two weeks have been used as training data and two weeks for validation. As mentioned before, three different scenarios are proposed to obtain the occupancy needed for CO2 concentration predictions: counting camera, lecture schedule and motion sensor. This input is required to calculate the CO2 production in the CO2 mass balance shown in equation (3).
To validate the fitted models, the residuals are tested for white noise. The cumulative periodograms of the residuals for CO2 prediction for both the RC and ARX models are displayed in Figure 10. They show that the residuals of the fitted models can be regarded as white noise. For the ARX models a 3rd-order model has been used in all three scenarios. The highest accuracy is obtained by using the counting cameras. In general, the RMSE stays below 100 ppm for all predictions in the counting camera scenario. The RMSE is, however, still above the accuracy of a standard CO2 sensor (±50 ppm). This is attributed to the low complexity of the models, which contain only one state. Multi-step-ahead predictions are problematic for the prediction accuracy because CO2 concentrations rise quickly when a group of persons enters a room or when the airflow rate varies; CO2 concentration is a fast-responding parameter. The identified models struggle with these predictions, resulting in high residuals at the start of lectures. The motion scenario shows the lowest accuracy, because the motion signal does not take into account the number of people. Instead, the motion signal of 1 or 0 is multiplied by the CO2 generation rate to calculate the room CO2 generation. In a similar study [15] an RMSE of 41 ppm was found during a 4-day period for a 1-step-ahead prediction with a 15-minute time interval. In this study a value of 46 ppm is found for the RC counting scenario and 41 ppm for the ARX counting scenario.
Fig. 11. RMSE for the n-step ahead CO2 concentration prediction of the RC model for three scenarios
The ARX multi-step-ahead predictions, illustrated in Figure 12, demonstrate that the schedule scenario is not accurate: an RMSE of at most 150 ppm is found for a 5-step-ahead prediction. This is caused by the fact that the schedule assumes that persons are present during the complete lecture time.
However, in reality lectures start later or end earlier, which results in a mismatch for the predictions. For the counting and motion scenarios good results are obtained for the 5- and 15-minute time intervals. The RMSE values are 40 ppm and 75 ppm respectively. Unlike the schedule scenario, both of these scenarios contain information on when the room is actually in use. In general, all presented CO2 results show that the 60-minute results are not accurate, with an RMSE of 100 ppm or more. Important information on CO2 concentration fluctuations and CO2 production is lost in the averaging process. In addition, CO2 is a fast-responding parameter, and averaging it to hourly values may lead to unreliable predictions. The residuals of the 1-step-ahead predictions, illustrated in Figure 13, show the discrepancies in more detail. It is clearly visible that a motion sensor, used in an RC model, is not accurate for CO2 predictions. The residuals during lecture time are at most 300 ppm. The best results are obtained using the counting cameras. However, even then the maximum difference can be 300 ppm. High residuals are found at the start of a lecture, which indicates that the identified models have problems predicting the CO2 concentration during the first minutes of a lecture. The residuals of the identified ARX models show comparable results for all three scenarios. Residuals are at most 300 ppm. However, the motion scenario gives the worst results, since more discrepancies are found here. Between the schedule and counting scenarios no differences are found for the 1-step-ahead residuals. However, from the data presented in Figure 12 it is known that the schedule scenario does not perform well for multi-step-ahead predictions compared to the counting scenario.
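The n-step-ahead RMSE used throughout these comparisons can be computed as in the Python sketch below. It assumes a generic predictor callback, predict_fn(t, n), that returns the prediction for time t+n using only information available at time t; this interface is a hypothetical convenience and not the implementation used in this study.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equally long series."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def n_step_rmse(measured, predict_fn, n):
    """RMSE of n-step-ahead predictions over a measured series.

    predict_fn(t, n) must return the prediction of measured[t + n]
    made with information up to and including time index t.
    """
    errors = [
        measured[t + n] - predict_fn(t, n)
        for t in range(len(measured) - n)
    ]
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

A persistence predictor, `lambda t, n: measured[t]`, is a useful baseline: a model that does not beat it at a given horizon adds little for MPC at that horizon.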
In addition to the previous results, Table 4 gives the RMSE for the first day during operating hours (8:30-17:30). These values give more detail about the prediction error during hours with occupancy. They show that overall the RC models are less accurate than the ARX models, with the exception of the counting scenario. The data also show that the motion sensor scenario is the least accurate for both model types. A possible solution might be to combine the motion signal with the lecture schedule to account for the "real" start and end time of the lecture. Regarding the required accuracy for CO2 predictions, an accuracy of 50-75 ppm would still be acceptable since this is comparable to the accuracy of a CO2 sensor. As indicated by the results, this is feasible with the proposed models. As expected, the models containing the most detailed occupancy information show the best results, while the ARX schedule scenario is a good alternative.
For CO2 predictions the counting scenario resulted in the most accurate n-step-ahead predictions. However, in reality this information is in most cases not available and is expensive to obtain. Therefore the schedule scenario is proposed as the second-best alternative. For the RC models this scenario is still accurate enough, but for the ARX models the schedule scenario performs poorly, with an RMSE of approximately 150 ppm for a 5-step-ahead prediction on a 15-minute interval, compared to 100 ppm for the RC model. In general the motion sensor resulted in the least accurate models: since a motion sensor only indicates whether a room is occupied or unoccupied, crucial information on the group size is missing. The RC model in particular has problems with the motion scenario, with an RMSE of at most 175 ppm. While the ARX models were less accurate for the schedule scenario, a maximum RMSE of 140 ppm was found.
The most promising model for future application in MPC is the 3-state RC model combined with the CO2 RC model for the schedule scenario, in which a motion signal can be implemented to account for the real start and end time of the lecture. In addition, both RC models are already formulated as state space models, whereas the ARX models still need a transfer function to be formulated as a state space model. In general, the results for both temperature and CO2 predictions indicated that data averaged over a 60-minute time interval is not accurate enough for implementation in an MPC, since the RMSE is approximately 100 ppm or higher. Important information regarding the disturbances is lost when averaging the data over a large time interval. In addition, for application in all-air systems this interval is probably too long, since ventilation systems have a shorter response time. A 5- or 15-minute time interval seems more suitable for these types of systems with regard to multi-step-ahead predictions. In further research, the identified models will be used as dynamic models inside an MPC strategy. The dynamic model will provide future predictions of the indoor temperature and CO2 concentration. These predicted values will be used in an optimization process to obtain a control action for the HVAC system. This strategy will first be tested in a simulation environment to assess the robustness of the MPC. Afterwards the MPC will be implemented in a case study to evaluate the saving potential and the operation. The results can be used to optimize the current rule-based control strategy.