Fault detection in HVAC systems using a distribution considering uncertainties

Detecting and diagnosing faults that degrade the performance of heating, ventilation, and air conditioning (HVAC) systems is essential for maintaining high energy efficiency. The performance of HVAC systems can be evaluated by analyzing monitored data. However, data from an HVAC system generally include uncertainties, which render the monitored data less reliable. We therefore focused on these uncertainties and calculated a performance distribution that accounts for them. The uncertainties from sensors, actuators, and communications were modelled stochastically and incorporated into a detailed simulation. The system coefficient of performance (SCOP), defined as the ratio of supplied heat to total power consumption, was used as the performance indicator. The SCOP distributions over representative weeks in 2007 and 2015 were calculated by repeating the simulation 2,000 times with different uncertainties. For 2015, the 90% confidence interval of the distribution ranged from -4.9% to +5.8% relative to the SCOP value without uncertainties. The SCOP value determined from the monitored data in 2015 was below the lower end of the distribution, whereas that in 2007 was inside the interval. An analysis of the monitored data showed that fault detection is possible by comparing the monitored data with the distribution.


Introduction
Since heating, ventilation and air conditioning (HVAC) systems account for a large proportion of the total energy consumed in buildings, they must maintain high efficiency. However, it is difficult to avoid faults that deteriorate performance. Faults in commercial buildings are assumed to cause efficiency reductions ranging from 5% to 30% [1][2][3]. Therefore, the use of Fault Detection and Diagnosis (FDD) is very important [4].
FDD methods are primarily classified into three types: anomaly detection from historical data, rule-based methods using expert knowledge, and model-based methods [1]. Although these methods have different features, they all use monitored data as input. The monitored data are regarded as reliable values from which the true status of the system can be determined.
However, HVAC systems generally have uncertainties, which reduce the reliability of the monitored data [5]. A strategy for detecting sensor errors before applying FDD methods has been investigated. However, the uncertainties targeted in this research are inevitable because of limited equipment accuracy, and they differ from the errors targeted in prior research [6].
Against this background, this paper calculates the system performance under uncertainties using a detailed simulation and proposes a fault detection method that uses this performance. We focused on fault detection rather than full FDD because the presence of faults must be detected before diagnosis, which locates the faults, can be applied. The target system was the water side of an HVAC system in a large office building, called a heat source system. A simulation of the target system was constructed to calculate the performance fluctuation due to uncertainties. The uncertainties in the system were modelled and incorporated into the simulation. Subsequently, the performance distributions of the system were calculated from Monte Carlo simulations with different uncertainties.

System description

Target building and system
This research was conducted on a real system in a real building. The target building, completed in the autumn of 2006, is located in Tokyo, Japan. It houses offices and has a total floor area of approximately 162,000 m².
The target system is the water side of the HVAC system in the building, which is called a heat source system (Fig. 1). It has four chillers with a total capacity of approximately 12 MW, 20 pumps, heat exchangers, and water thermal storage tanks with a capacity of approximately 6,500 m³. In addition, it uses sewage heat instead of cooling towers.

Control system
Because the heat source system has water thermal storage tanks, its control logic is complex. At a fundamental level, it stores heat at night and discharges heat during the day. The tanks are 6 m deep, and the temperature in the tanks was measured at 12 points along the vertical direction. The residual heat charge amount was calculated from the amount of water in the tanks, the temperature in the tanks, and the reference temperature. Based on the residual heat charge at 22:00 and the sewage temperature, the integrated controller determined the operating loads of the chillers for heat storage in order to ensure efficient operation.
In addition to heat storage and discharge, the system has small control loops that use proportional-integral (PI) control, which is commonly used for feedback control in building services, as given in equation (1):

$$u(t) = K\left(e(t) + \frac{1}{T_I}\int_0^t e(\tau)\,d\tau\right) \tag{1}$$

where $t$ is time, $u(t)$ is the indicated value, $e(t)$ is the control deviation, and $K$ and $T_I$ are the coefficients for the proportional and integral terms, respectively. For example, the inverter frequency of a chiller chilled water pump is PI controlled so that the measured flow rate remains at its set value. Similarly, the heat exchanger pump is PI controlled so that the measured temperature reaches its set value. These PI controls are executed by direct digital controllers (DDCs).
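The PI control described above can be sketched in discrete time as follows. This is a minimal illustration, not the controller implemented in the DDCs: it assumes the common form in which the gain multiplies both the proportional and integral terms, a rectangular integration rule, a 1 min step (matching the simulation time step), and a simple anti-windup rule. The class name, the output limits of 0 to 100%, and all parameter values are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PIController:
    """Discrete PI law: u = K * (e + (1 / T_I) * integral of e)."""

    gain: float                 # proportional coefficient K (assumed value)
    integral_time: float        # integral time T_I in seconds (assumed value)
    dt: float = 60.0            # time step in seconds (1 min)
    u_min: float = 0.0          # assumed lower output limit, e.g. 0 %
    u_max: float = 100.0        # assumed upper output limit, e.g. 100 %
    _integral: float = field(default=0.0, init=False)

    def step(self, setpoint: float, measured: float) -> float:
        """Return the indicated value for one time step."""
        error = setpoint - measured
        candidate = self._integral + error * self.dt
        u = self.gain * (error + candidate / self.integral_time)
        # Anti-windup (assumption): keep the integral only if u is not saturated.
        if self.u_min <= u <= self.u_max:
            self._integral = candidate
        return min(max(u, self.u_min), self.u_max)
```

Note that the controller acts on the measured value, so any sensor uncertainty enters the loop through `measured`, which is the interaction the uncertainty model in this paper is meant to capture.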

System performance
The system coefficient of performance (SCOP) is the ratio of the heat supplied to the building to the total power consumed by the chillers and pumps. The SCOP was used as the performance indicator in this study. Fig. 2 shows weekly SCOP values from May to September 2007, the first summer in which the system was in operation. These values were calculated from the monitored data in the building energy management system (BEMS). The SCOP differed from week to week because system performance changes with boundary conditions such as heat load and sewage temperature. This makes fault detection difficult, because there is no established method for judging what level of performance is appropriate for given boundary conditions.
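The SCOP definition above can be written as a short function. The sketch assumes that heat and power are sampled at equal time intervals, so that the ratio of the sums equals the ratio of the energies over the period; the function and argument names are hypothetical and the numbers in the usage note are illustrative only.

```python
def scop(supplied_heat_kw, chiller_power_kw, pump_power_kw):
    """Ratio of supplied heat to total chiller and pump power over a period.

    All three arguments are sequences sampled at the same, equal time steps.
    """
    total_heat = sum(supplied_heat_kw)
    total_power = sum(chiller_power_kw) + sum(pump_power_kw)
    if total_power <= 0:
        raise ValueError("total power must be positive")
    return total_heat / total_power
```

For example, `scop([500, 500], [80, 80], [20, 20])` returns 5.0, since 1000 units of heat are supplied for 200 units of power.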

Heat source system simulation
The heat source system simulation used in this research was coded in MATLAB based on the equipment and design specifications of the target system. The calculation time step was set to 1 min, and the input items were the actual load, sewage temperature, chiller operation order, and some set values.
The flow rate was calculated by considering the pressure and flow balance in the pipe network; the flow balance calculations were based on Kirchhoff's law. The total pump head and flow rate were determined from the specification curve, which was reshaped according to the inverter frequency (Fig. 3). The pressure loss in pipes was calculated using the Darcy-Weisbach equation (2), and the pressure loss at valves was calculated from the opening degree and flow rate as an equal proportion:

$$\Delta P = \lambda \frac{L}{D} \frac{\rho v^2}{2} \tag{2}$$

where $\Delta P$ is the pressure loss [Pa], $\lambda$ is the friction factor, $L$ is the pipe length [m], $D$ is the pipe diameter [m], $\rho$ is the fluid density [kg/m³], and $v$ is the mean flow velocity [m/s].
The temperatures in the tanks and heat exchangers were calculated theoretically, and the outlet temperature of the heat exchanger was calculated using equations (3)-(5):

$$Q = K A \cdot LMTD \tag{3}$$

$$Q = G_h c \left(T_{h,in} - T_{h,out}\right) \tag{4}$$

$$Q = G_c c \left(T_{c,out} - T_{c,in}\right) \tag{5}$$

where $Q$ is the exchanged heat [W], $K$ is the heat transfer coefficient, $A$ is the heat exchange area [m²], $LMTD$ is the logarithmic mean temperature difference [°C], $G$ is the flow [kg/s], $c$ is the specific heat at constant pressure, and $T$ is the temperature [°C]. The subscripts $h$ and $c$ refer to the hotter and the colder side, respectively, and the subscripts $in$ and $out$ refer to the inlet and outlet, respectively.
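The pipe pressure loss calculation mentioned above can be illustrated with the standard Darcy-Weisbach relation. The sketch below is not the paper's MATLAB code; the function name, the water density default, and the friction factor passed in the usage note are assumptions for the example.

```python
import math


def pipe_pressure_loss(flow_m3_s, length_m, diameter_m,
                       friction_factor, density_kg_m3=1000.0):
    """Darcy-Weisbach pressure loss [Pa] for a straight circular pipe."""
    area = math.pi * diameter_m ** 2 / 4.0          # cross-section [m^2]
    velocity = flow_m3_s / area                     # mean velocity [m/s]
    return (friction_factor * (length_m / diameter_m)
            * density_kg_m3 * velocity ** 2 / 2.0)
```

With an assumed friction factor of 0.02, a 100 m pipe of 0.2 m diameter carrying water at 1 m/s loses about 5,000 Pa. Because the loss grows with the square of the flow, a small error in a controlled flow rate changes the required pump head noticeably.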
The performance of the chillers was calculated using the specification curve (Fig. 4). The control logic, such as heat charge and discharge, and the PI controls were incorporated as in the original system. We also incorporated the thresholds and waiting times that the controllers use to change the number of operating pumps. In total, 102 items such as flow rates, temperatures, and power were output. The monitored data and simulation results for the chilled water temperature and flow rate over a representative week are compared in Fig. 5. The behaviours are similar, and the simulation reproduces the phenomena observed in the real system. Because both datasets have 1 min intervals, their values fluctuate sharply when the number of operating pumps is changed.

Uncertainty modelling
Uncertainties in the heat source system were categorized into three types (Fig. 6): uncertainties from the sensors (Type 1), from the DDC (Type 2), and from the actuators (Type 3). Actuators are the controlled objects, which include valves and the inverter frequency of the pumps. The data collected by the BEMS and available for monitoring (monitored data) can be regarded as being affected by these types of uncertainties.
These uncertainties were assumed to be normally distributed, with a mean and variance conforming to the equipment accuracy. They are inevitable even when the equipment has no faults. Uncertainties in the sensors are equivalent to random errors that occur during measurement [5]. Therefore, the SCOP values calculated from the simulation results in the presence of uncertainties can be regarded as values without faults, because the uncertainties are not faults.
There are two kinds of uncertainties: full scale (FS) and reading (RD). FS is associated with the allowed range of a value, as defined in equation (6); RD is associated with the read value, as defined in equation (7):

$$x_1 = x + \mathit{rand} \cdot \sigma \tag{6}$$

$$x_1 = x \left(1 + \mathit{rand} \cdot \sigma\right) \tag{7}$$

where $x_1$ is the value with the uncertainty, $x$ is the value without the uncertainty, $\mathit{rand}$ is a random number drawn from a standard normal distribution, and $\sigma$ is the standard deviation of the uncertainty.
To model the uncertainties, the standard deviation σ was set to half the equipment accuracy given in Table 1. These parameters and uncertainties were taken from the Japanese Industrial Standards and similar sources [6]-[10]. Approximately 250 uncertainties were incorporated into the simulation. Both values with uncertainties (measured values) and values without uncertainties (true values) are calculated in the simulation (Fig. 7). Conventional simulations that do not consider uncertainties calculate only true values, because they do not distinguish between true values and values with uncertainties. In real systems, the state can be grasped only by monitoring, so true values can never be obtained. This simulation model can calculate the control state that results from the interaction between various uncertainties.
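The FS and RD uncertainty models and the rule that σ is half the equipment accuracy can be sketched as below. This assumes that an FS uncertainty is additive with σ in the units of the measured quantity and that an RD uncertainty is proportional to the reading with σ as a fraction; the function names and the accuracy figures in the usage note are hypothetical and are not taken from Table 1.

```python
import random


def sigma_from_accuracy(accuracy):
    """Standard deviation set to half the stated equipment accuracy."""
    return accuracy / 2.0


def apply_fs(x, sigma, rand):
    """Full-scale uncertainty: offset independent of the value itself."""
    return x + rand * sigma


def apply_rd(x, sigma, rand):
    """Reading uncertainty: offset proportional to the value itself."""
    return x * (1.0 + rand * sigma)


def draw_rand(rng):
    """One standard-normal random number for one uncertainty."""
    return rng.gauss(0.0, 1.0)
```

For instance, with an assumed temperature sensor accuracy of ±0.3 °C (FS) and a draw of `rand = 2.0`, a true value of 7.0 °C is measured as about 7.3 °C; with an assumed flow meter accuracy of ±2% (RD) and `rand = -1.0`, a true flow of 100 is measured as about 99.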

Monte Carlo simulation
The Monte Carlo method, a computational algorithm based on repeated random sampling, was used to calculate the SCOP under various combinations of uncertainties and to obtain its distribution. In this research, the simulation was repeated 2,000 times with different uncertainties. It should be noted that the values of the uncertainties were not changed at every calculation step; they were held constant within each simulation and changed for each of the 2,000 simulations. This is based on the assumption that the values of the uncertainties do not change over a short period.
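The sampling scheme described above, in which each run draws its uncertainty values once and keeps them fixed for all time steps, can be sketched as follows. The detailed heat source simulation is not reproduced here: `toy_simulation` is a hypothetical placeholder with invented numbers, used only to show where the per-run draws enter, and the seed is an assumption added for reproducibility.

```python
import random


def run_monte_carlo(simulate, n_uncertainties, n_runs=2000, seed=0):
    """Repeat `simulate` with a fresh set of uncertainty draws per run."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        # One standard-normal draw per uncertainty, fixed for the whole run.
        rands = [rng.gauss(0.0, 1.0) for _ in range(n_uncertainties)]
        results.append(simulate(rands))
    return results


def toy_simulation(rands, n_steps=60):
    """Hypothetical stand-in for the detailed simulation (not the paper's model).

    A single RD-type bias of 1 % standard deviation is applied to the heat
    reading at every time step, using the same draw throughout the run.
    """
    bias = 0.01 * rands[0]
    heat = 0.0
    power = 0.0
    for _ in range(n_steps):
        heat += 500.0 * (1.0 + bias)   # biased heat reading, same bias each step
        power += 100.0
    return heat / power


scop_samples = run_monte_carlo(toy_simulation, n_uncertainties=250)
```

Holding the draws fixed within a run makes each uncertainty behave like a persistent offset rather than step-to-step noise, which is why it can shift the control state instead of averaging out.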

Distribution of weekly SCOP in 2007
Fig. 8 shows the distribution of SCOP calculated from the Monte Carlo simulation results, together with its 90% confidence interval and the SCOP value determined from the monitored data. The simulation period was a representative summer week (from 22:00 on July 28 to 21:59 on August 4, 2007). The 90% confidence interval was 4.69 to 5.20, and the SCOP value determined from the monitored data was 5.05, which lies within the interval. For reference, the SCOP value determined from the simulation without uncertainties was 4.96, and the mean of the distribution was 4.94. Because the monitored value is within the 90% confidence interval, one can assume that the system does not have any obvious faults that severely affect SCOP.
Five other distributions were also generated for comparison with the above result. The first was generated from the same simulation as Fig. 8, but with SCOP calculated from the true values (Fig. 9). It has a long tail on the left side, and the mean SCOP was 4.95, slightly smaller than the SCOP value without uncertainties.
Because the SCOP value in Fig. 9 was calculated from the true values in the simulation result, the distribution in Fig. 9 cannot be reliably compared with the monitored data. However, the longer tail on the left side and smaller average than SCOP without uncertainties imply the uncertainties worsen SCOP.
Another distribution was generated from the simulation result without uncertainties by adding the uncertainties afterwards to the items affecting SCOP, such as power consumption, chilled water supply and return temperatures, and chilled water flow rate (Fig. 10). Here, as in Fig. 8, the uncertainties were based on the parameters in Table 1 and were changed for each of the 2,000 calculations. The distribution in Fig. 10 is narrower than that in Fig. 8 because it does not consider the interaction of uncertainties in the system. In addition, as expected, it has neither a longer left tail nor a biased mean. Therefore, the distribution in Fig. 8 is more appropriate for fault detection.
Figs. 11 to 13 show cases where uncertainties occur only at the sensors, actuators, and communication equipment, respectively. Fig. 11 is the most similar to Fig. 8, which means that uncertainties at the sensors are the dominant uncertainties in the system. In addition, uncertainties at the sensors interactively influence the controls. The 90% confidence interval in Fig. 11 was 4.71 to 5.20, which is close to that in Fig. 8 (4.69 to 5.20); thus, incorporating only the sensor uncertainties is effective for simplified modelling. Uncertainties in the actuators hardly affect the control of the system, except when an actuator is driven to 0% or 100%. This is because the actuator is PI controlled with reference to the control target and its set value, not to the actuator itself. As for uncertainties at the communication equipment, the accuracy range in communication is narrower than that of the sensors and actuators (Table 3).
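The comparison used in this section, in which a monitored SCOP is checked against the 5% and 95% points of the simulated distribution, can be sketched as below. The sketch uses Python's `statistics.quantiles` with its default interpolation, which is an assumption; the paper does not state how the percentile points were computed. The function names are hypothetical.

```python
import statistics


def interval_90(samples):
    """Return the 5 % and 95 % points of the simulated SCOP samples."""
    cuts = statistics.quantiles(samples, n=20)   # 19 cut points, 5 % apart
    return cuts[0], cuts[-1]


def judge(monitored, lower, upper):
    """Classify a monitored SCOP relative to the 90 % interval."""
    if monitored < lower:
        return "below"      # performance lower than expected: suspect a fault
    if monitored > upper:
        return "above"
    return "within"


# Values quoted in the text for the 2007 week.
print(judge(5.05, 4.69, 5.20))   # prints "within"
```

A result of "below" corresponds to the case in which the monitored SCOP falls under the low end of the distribution and the system is flagged for further analysis.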

Distribution of daily SCOP in 2007
To analyse how the distributions vary with operating conditions, the 90% confidence interval for each day of the target week was calculated (Fig. 14). Both the SCOP without uncertainties and the 5% points varied from day to day. Relative to the SCOP value without uncertainties, the narrowest 90% confidence interval ranged from -4.7% to +5.1% (Tuesday), and the widest ranged from -8.7% to +10.6% (Saturday). Because the target system stores and releases heat every day according to the heat load, the operation pattern is classified into four main types: weekday after holiday (Monday), holiday after weekday (Saturday), normal weekday (Tuesday, Wednesday, Thursday, Friday), and normal holiday (Sunday). Because the SCOP values from Tuesday to Thursday are close to each other, the operation pattern appears to influence both SCOP and the confidence interval.
Regarding the monitored data, the tendency of the fluctuation was the same as in the simulation. However, the SCOP on Monday was much higher than the corresponding simulation result. This was assumed to be caused by a difference in the set values that determine when heat charging should end based on the amount of heat stored in the tanks. In the simulation, the amount of heat stored on Monday was smaller than in the real system, and the reverse was true on Tuesday. This produces the difference seen in Fig. 14.

Distribution of supplied heat and total power in a day of 2007
Wednesday is a normal weekday for the system and the SCOP value derived from the monitored data was found to be appropriate (Fig. 14), thus the time series data for Wednesday were analysed (Fig. 15).
In the time series, the monitored supplied heat was between the lower and upper 5% points and was nearly the same as the value without uncertainties (Fig. 15 (a-1)). This is because the supplied heat determined from the monitored data was input to the simulation. However, the total power varied, as confirmed by the lower and upper 5% points (Fig. 15 (b-1)). The uncertainties influenced the heat storage and discharge control, which in turn changed the control of the chillers.
Under stable control, both the supplied heat and the total power at 13:00 were distributed in a narrow range (Fig. 15 (a-2), (b-2)). The monitored supplied heat was nearly the same as the value without uncertainties, but the total power was lower than the values in the distribution. Although the control states at 13:00 were close, power consumption was inconsistent because of variations in factors such as pump efficiency and pressure loss.
Control around 17:30 was unstable in the simulation because the chillers started operating (Fig. 15 (a-3), (b-3)). The time step of the simulation was 1 min, whereas PI control requires tens of minutes to produce a drastic change. In addition, it should be noted that the monitored data were converted into 15 min segments.
Considering the results in Sections 4.2 and 4.3, even when the weekly SCOP is within the distribution, a real system may operate differently from the behavior shown in the simulation results, which represent the ideal control states. In this case, the difference is not large enough to conclude that obvious faults occurred in the system from the perspective of SCOP. When managing multiple buildings simultaneously, the distribution shown in Fig. 8 is effective for evaluating the performance of all the buildings. Buildings can then be managed more efficiently by preferentially analysing data from buildings whose monitored data are far from the distribution.

Fault detection in a week in 2015
The SCOP value determined from the monitored data in 2015 was below the low end of the distribution (Fig. 16). Therefore, the real system has faults from the perspective of SCOP.
The periods in Fig. 8 and Fig. 16 were in different years, but the seasons were almost the same. However, the SCOP value without uncertainties in 2015 was lower than that in 2007 by 0.31 (6.3%). This is assumed to be caused by the boundary conditions, such as load and sewage temperature. Using the proposed method, it is possible to evaluate system performance while considering the boundary conditions, which change every day.
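As a consistency check on the figures quoted above, and assuming that the 2007 reference is the weekly SCOP without uncertainties reported for Fig. 8 (4.96), the 2015 value and the relative difference work out as follows. The 2015 value of 4.65 is derived here from the quoted difference and is not stated directly in the text.

```latex
\begin{align*}
\mathrm{SCOP}_{2015} &= \mathrm{SCOP}_{2007} - 0.31 = 4.96 - 0.31 = 4.65, \\
\frac{0.31}{4.96} &= 0.0625 \approx 6.3\%.
\end{align*}
```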
As an example of fault analysis, the estimated heat exchange areas in 2007 and 2015 were compared (Fig. 17). In the simulation, the outlet temperature of the heat exchanger was calculated based on heat transfer principles (equations (3)-(5)). Using the same principles and the monitored data (the inlet and outlet temperatures and the flows on the hotter and colder sides), the exchange areas were estimated every 15 min. It should be noted that the physical heat exchange area does not change; the area estimated from the measured values, on the assumption that the heat transfer coefficient did not change, serves as a parameter representing the performance of the heat exchanger. The estimated heat exchange area in 2015 is clearly smaller than that in 2007. If the estimated area of the heat exchanger named CHEX (see Fig. 1) becomes small, a higher flow is required to transfer the same amount of heat, which increases the required pump power. The estimated area is lower at night than during the day because the amount of heat exchanged at night is very small.
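The area estimation described above can be sketched from the heat balance and the logarithmic mean temperature difference. The sketch makes several assumptions that the text does not confirm: a counterflow arrangement, the exchanged heat taken from the colder side, water with a specific heat of 4186 J/(kg K), and a constant heat transfer coefficient whose value is supplied by the caller. All names are hypothetical.

```python
import math


def lmtd_counterflow(t_h_in, t_h_out, t_c_in, t_c_out):
    """Logarithmic mean temperature difference, assuming counterflow."""
    dt1 = t_h_in - t_c_out
    dt2 = t_h_out - t_c_in
    if dt1 <= 0 or dt2 <= 0:
        raise ValueError("temperature differences must be positive")
    if math.isclose(dt1, dt2):
        return dt1                      # limit of the formula when dt1 == dt2
    return (dt1 - dt2) / math.log(dt1 / dt2)


def estimated_area(g_c, t_h_in, t_h_out, t_c_in, t_c_out, k, c=4186.0):
    """Estimate the heat exchange area [m^2] from monitored values.

    g_c: colder-side flow [kg/s]; k: assumed constant heat transfer
    coefficient [W/(m^2 K)]; c: specific heat [J/(kg K)].
    """
    q = g_c * c * (t_c_out - t_c_in)    # exchanged heat [W], colder side
    return q / (k * lmtd_counterflow(t_h_in, t_h_out, t_c_in, t_c_out))
```

Applied to each 15 min record, a downward drift of this estimate between two periods indicates degraded heat exchanger performance under the stated assumptions. At very small exchanged heat the estimate becomes small as well, which matches the lower night-time values noted above.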
The proposed method is effective for detecting faults in the system. However, it cannot diagnose what kind of faults occurred in the system. Therefore, FDD is the next step required in this research.

Conclusion and implications
The distribution of SCOP values for a heat source system over the course of a representative week was elucidated using a detailed simulation, which incorporates interactions among uncertainties in sensor measurements, DDCs, and actuators. The 90% confidence interval of the distribution was approximately 10% of the SCOP value calculated without uncertainties. Performance evaluation and fault detection were performed using the distribution and its confidence interval.
Regarding the distribution, the effect of uncertainties at the sensors was dominant compared to those at the DDCs and actuators. The uncertainties at the actuators hardly affected the SCOP value because, although the true value at the actuator differs from the indicated value, the actuator is PI controlled with reference to the control target. Therefore, uncertainties in the sensor measurements are the most important factor determining whether a heat source system can operate efficiently.
In addition, distribution analysis was implemented over periods of a week, a day, and a minute. Periods of one day and one minute were too short to provide an accurate comparison with the monitored data, because the control state can easily be changed by modifying the set values, which does not cause severe degradation.
However, this method may not be able to detect faults whose influence is small. It is therefore effective for detecting faults that have a large influence on the system. Further, this method is effective when multiple systems are managed simultaneously and a system with significant performance degradation should be analysed preferentially.
FDD is required for locating and eliminating faults in order to achieve the most efficient operation of a heat source system. The proposed method is appropriate for detecting degraded system performance. Therefore, fault diagnosis and elimination will be the focus of future research.