Identification of parameters of heat supply facilities using telemetry data

The purpose of the study is to test the hypothesis that telemetry data can be used to identify heat supply objects. The article considers four models approximating telemetry data. Coefficients of determination and standard deviations were used to select the best model. The residuals were analyzed for randomness and for the absence of shifts and trends. The consistency of the frequency distribution of the residuals with the normal distribution was checked, and the significance of the coefficients included in the approximating functions was estimated. Regression analysis was used to obtain the coefficients of the four models for each of the seven objects. The Pearson test confirmed the consistency of the distribution of the residuals of one of the models with the normal distribution for all objects. The significance of the coefficients included in all models was confirmed using Student's t-distribution. The proposed models take into account the flow rate of the coolant and the outdoor air temperature. The dependences obtained do not contradict the physics of the process, either within the range of observation or beyond its boundaries. With certain restrictions on the coefficients of the model, it is possible to obtain numerical values of the parameters of heat supply objects, namely the average indoor air temperature and the required heating load, which confirms the hypothesis that telemetry data can be used to identify the parameters of heat supply objects.


Introduction
District heating systems supply coolant for heating buildings. The most common approach, qualitative regulation of heat supply for heating needs, involves regulating the temperature of the coolant at the heat supply source in accordance with the temperature schedule [1]. The temperature schedule developed for each source represents the dependence of the coolant temperature at the outlet of the source t1 and at its inlet t2 on the outdoor air temperature to.
Quantitative regulation is gradually being introduced into district heating systems. This method of regulation involves changing the flow rate of the coolant circulating in the system [2].
Corresponding author: KitaytsevaEH@mgsu.ru
E3S Web of Conferences 410, 04009 (2023) https://doi.org/10.1051/e3sconf/202341004009 FORM-2023
The complexity of regulating heat release for heating systems lies in the fact that the objective parameter for evaluating the quality of a heating system is the indoor air temperature ti. When designing heating systems, an obligatory step is the calculation of the heat losses of premises through the external enclosing structures. The indoor air temperature in this calculation is taken depending on the type of room. When calculating the heat load for the heating needs of a building, the value of the indoor air temperature is averaged and determined by the purpose of the building. In any case, the installation location of a temperature sensor for ti cannot be objectively determined.
Therefore, the parameter whose dynamics reflect the process of heat supply and heat consumption of objects is taken to be [3][4][5][6] the temperature t2 at the outlet of the heating systems, or the temperature at the inlet to the central heating point (CHP).
At the stage of designing the connection of a subscriber to the heating network, the calculated (design) values of Qc are used, obtained for the design climatic conditions, the selected thermal characteristics of the enclosing structures and the parameters of the indoor air. During the operation of buildings, the thermal characteristics of building envelopes can change in either direction [7]. A building can be completely or partially repurposed, which in some cases may change the requirements for the indoor air temperature. The area and heat transfer of the heating devices of the heating system may change. The number of heating systems connected to the heating network is also not constant.
As a result, the amount of heat required to provide comfortable conditions in the buildings may also differ from the calculated value.
In [8][9][10][11][12], regression analysis is used to construct the dependence of the coolant temperature t2 on the temperature t1 at the outlet of the CHP and on the coolant flow rate M. The resulting approximating dependencies are taken as a standard and are used to evaluate the effectiveness of regulating the heat supply for heating needs. In [13], regression analysis is used to verify telemetry data.
The purpose of this work is to build models that take into account the parameters affecting the process of heat supply, to choose the best of these models, and to use the results of regression analysis to identify the parameters of heat supply facilities [14][15][16].
When constructing and selecting models, the following criteria should be used individually or in some combination [17]:
1. the minimum number of independent variables that significantly affect the accuracy of the approximation;
2. the simplest form compatible with reasonable error;
3. reasonable physical grounds ("follows from some law");
4. the minimum sum of squared deviations between the predicted values and the measurement results.

Methods
When building the models, the following dependencies reflecting the physical processes were used:
the dependence of the heat flow Q on the difference between the temperatures t1 and t2 (1);
the dependence of the heat losses Qb on the difference between the indoor air temperature ti and the outdoor air temperature to (2);
the dependence of the heat transfer Qh of the heating devices of heating systems on the difference between the average temperature of the coolant and the indoor air temperature ti (3).
The temperature t2 is primarily affected by the temperature t1 at the outlet of the source. In [13], models I and III are based on dependence (1), which allows a linear relationship between the temperatures to be assumed. The restrictions imposed on the values of the model coefficients and the scope of the models are also given in [13].
Models I-III do not include the outdoor air temperature to. Based on dependence (2), models IV-VII were constructed. In all models except model VI, the average flow rate Mm = 0.5 (M1 + M2) is used as the flow rate, where M1 and M2 are the flow rates in the supply and return pipelines, respectively.
If, in models VI and VII, the coefficient b1 of the temperature t1 is equal or close to 1, then dependencies (9) and (10) turn into dependence (2). In this case, the values of the other coefficients can be used to estimate the indoor air temperature ti averaged over the observation period (11) and the calculated heating load (12).
Models IV-VII are linear in the coefficients, so linear regression analysis is applicable to them [17]. The coefficients are found by the least squares method.
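For models IV-V, the estimates of the coefficients b are found by solving the system of linear equations

$$x^{T} x\, b = x^{T} Y, \qquad (13)$$

where

$$x = \begin{pmatrix} 1 & x_{11} & x_{12} & x_{13} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_{n1} & x_{n2} & x_{n3} \end{pmatrix}, \quad b = \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}, \quad Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix};$$

n is the sample size; q is the number of model parameters. For models IV and V, q = 4; for models VI and VII, q = 3.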
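As a minimal sketch of how system (13) can be solved numerically, the snippet below builds a design matrix with a column of ones (the case with a free term, q = 4) and solves the normal equations with NumPy. The function name `fit_normal_equations` and the synthetic data are illustrative assumptions; they are not the authors' code or measurements.

```python
import numpy as np

def fit_normal_equations(x, y):
    """Solve x^T x b = x^T y for b (ordinary least squares)."""
    xtx = x.T @ x
    xty = x.T @ y
    return np.linalg.solve(xtx, xty)

# Synthetic illustration: n = 50 observations, three regressors plus a free term.
rng = np.random.default_rng(0)
n = 50
regressors = rng.uniform(0.0, 1.0, size=(n, 3))
x_with_free_term = np.column_stack([np.ones(n), regressors])  # q = 4 columns
true_b = np.array([2.0, 0.8, -1.5, 0.3])                      # assumed values
y = x_with_free_term @ true_b   # noise-free, so the fit should recover true_b

b_hat = fit_normal_equations(x_with_free_term, y)
```

For a model without a free term, the column of ones is simply omitted. In practice `np.linalg.lstsq` is usually preferred to forming x^T x explicitly, because it is less sensitive to ill-conditioned design matrices.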
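For models VI-VII, which have no free term, the matrix x and the vector b have the form

$$x = \begin{pmatrix} x_{11} & x_{12} & x_{13} \\ \vdots & \vdots & \vdots \\ x_{n1} & x_{n2} & x_{n3} \end{pmatrix}, \quad b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}.$$

Table 1 shows how the independent variables are related to the measured parameters. The dependent variable Y is the temperature t2.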
Table 1. Independent variables of models.
When constructing all models, a scale factor of 0.001 was applied to the flow rates M1, M2 and Mm.
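One way to estimate the accuracy of data approximation is the coefficient of determination, calculated by the formula

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^{2}}{\sum_{i=1}^{n} (Y_i - \bar{Y})^{2}},$$

where $\hat{Y}_i$ is the predicted value and $\bar{Y}$ is the average value over the entire sample.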
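The coefficient of determination can be computed in a few lines. The sketch below assumes the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares; the function name `r_squared` and the four sample values are made up for illustration.

```python
def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y) / len(y)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, y_pred))
    ss_tot = sum((obs - y_mean) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted return temperatures, deg C
y_obs = [40.0, 45.0, 50.0, 55.0]
y_fit = [41.0, 44.0, 51.0, 54.0]
r2 = r_squared(y_obs, y_fit)  # SS_res = 4, SS_tot = 125
```

Note that for models without a free term this definition and the alternative based on the explained sum of squares need not coincide, so the choice should be stated when comparing models.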
The least squares method is based on the assumption that the residuals are normally distributed. To check the consistency of the distribution of the residuals with the normal distribution, the Pearson chi-squared criterion is used [19].
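One possible form of the Pearson check is sketched below using only the Python standard library. It assumes equiprobable bins under a normal distribution fitted to the residuals and a 5% significance level; the number of bins, the function name `pearson_chi2_normal` and the pseudo-residuals are assumptions of this sketch, since the binning used in the study is not stated.

```python
from statistics import NormalDist, mean, stdev

def pearson_chi2_normal(residuals, n_bins=8):
    """Pearson chi-squared statistic for H0: residuals ~ N(mean, sd).

    Bins are equiprobable under the fitted normal, so each bin has the
    same expected count n / n_bins. Returns (statistic, degrees_of_freedom).
    """
    n = len(residuals)
    fitted = NormalDist(mean(residuals), stdev(residuals))
    # Interior bin edges at the quantiles 1/n_bins, 2/n_bins, ...
    edges = [fitted.inv_cdf(i / n_bins) for i in range(1, n_bins)]
    observed = [0] * n_bins
    for r in residuals:
        idx = sum(r > e for e in edges)  # number of edges lying below r
        observed[idx] += 1
    expected = n / n_bins
    stat = sum((o - expected) ** 2 / expected for o in observed)
    dof = n_bins - 1 - 2  # two parameters (mean, sd) are estimated
    return stat, dof

# Deterministic, nearly normal pseudo-residuals for illustration
sample = [NormalDist(0.0, 1.0).inv_cdf((i + 0.5) / 200) for i in range(200)]
stat, dof = pearson_chi2_normal(sample)

CHI2_CRIT_5DOF_005 = 11.070  # tabulated critical value, 5 dof, alpha = 0.05
consistent = stat < CHI2_CRIT_5DOF_005
```

The critical value is hard-coded from standard tables to keep the sketch dependency-free; with SciPy available it could be obtained from `scipy.stats.chi2.ppf` instead.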
Estimates of the model parameters are calculated iteratively. At the first iteration, the entire sample is examined and the residuals are analyzed. All residuals that do not fall into the range

$$[\,Q_1 - k\,(Q_3 - Q_1),\; Q_3 + k\,(Q_3 - Q_1)\,],$$

where Q1 and Q3 are the first and third quartiles of the sample, respectively, and the coefficient k is 3 for explicit outliers [19], are removed from the sample, and the process is repeated.
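The iterative procedure can be sketched as follows, assuming that the range is the usual quartile fence [Q1 - k(Q3 - Q1), Q3 + k(Q3 - Q1)] and that iteration stops when no further residuals are removed. The function name `fit_with_outlier_removal`, the stopping rule and the synthetic data are assumptions of this sketch.

```python
import numpy as np

def fit_with_outlier_removal(x, y, k=3.0, max_iter=20):
    """Refit by least squares until no kept residual lies outside
    [Q1 - k*IQR, Q3 + k*IQR]. Returns (coefficients, keep_mask)."""
    keep = np.ones(len(y), dtype=bool)
    for _ in range(max_iter):
        b, *_ = np.linalg.lstsq(x[keep], y[keep], rcond=None)
        resid = y - x @ b
        q1, q3 = np.percentile(resid[keep], [25, 75])
        iqr = q3 - q1
        inside = (resid >= q1 - k * iqr) & (resid <= q3 + k * iqr)
        new_keep = keep & inside
        if new_keep.sum() == keep.sum():  # nothing removed: converged
            break
        keep = new_keep
    return b, keep

# Synthetic line y = 2 + 3 t with small deterministic noise and one gross outlier
n = 60
t = np.linspace(0.0, 10.0, n)
x = np.column_stack([np.ones(n), t])
y = 2.0 + 3.0 * t + 0.1 * np.sin(np.arange(n))
y[10] += 50.0  # injected outlier

b, keep = fit_with_outlier_removal(x, y)
```

Here the injected point is dropped after the first pass and the second pass refits on the remaining 59 points. If `max_iter` is reached, the returned coefficients correspond to the sample before the last removal, so a larger limit may be needed for heavily contaminated data.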
To test the hypothesis that the parameter k is not significant (coefficient b_k = 0), the statistic

$$t_k = \frac{|b_k|}{s\,\sqrt{c_{kk}}}$$

is calculated, where $c_{kk}$ is the k-th diagonal element of the matrix $c = (x^{T} x)^{-1}$ and s is the standard deviation of the residuals. When the condition $t_k > t_{1-\alpha/2}$ for n-q-1 degrees of freedom is met, the hypothesis of the parameter's insignificance is rejected.
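A hedged sketch of the significance check is given below. It assumes the statistic has the form |b_k| / (s sqrt(c_kk)), with s the residual standard deviation, and follows the text in using n-q-1 degrees of freedom (n-q is the more common convention when q already counts the free term). The function name `coefficient_t_stats`, the synthetic data and the approximate critical value are illustrative assumptions.

```python
import numpy as np

def coefficient_t_stats(x, y, b):
    """Return (|b_k| / (s * sqrt(c_kk)) for every k, degrees of freedom)."""
    n, q = x.shape
    resid = y - x @ b
    dof = n - q - 1                      # as stated in the text
    s = np.sqrt(resid @ resid / dof)     # residual standard deviation
    c = np.linalg.inv(x.T @ x)
    return np.abs(b) / (s * np.sqrt(np.diag(c))), dof

# Synthetic data: y = 1 + 2 u plus small deterministic noise
n = 100
u = np.linspace(0.0, 1.0, n)
x = np.column_stack([np.ones(n), u])
y = 1.0 + 2.0 * u + 0.1 * np.sin(1.7 * np.arange(n))
b = np.linalg.solve(x.T @ x, x.T @ y)

t_stats, dof = coefficient_t_stats(x, y, b)
T_CRIT = 1.99  # approximate two-sided 5% critical value for about 97 dof
significant = t_stats > T_CRIT
```

A coefficient whose statistic exceeds the critical value is treated as significant; in this synthetic case both coefficients are far above the threshold.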

Results
Seven central heating points (CHP) connected to one heat supply source were used as modeling objects. The sample covered measurements taken during the heating period. Data were collected once a day. The initial sample size is n = 340. For all objects, measurements in which the flow rate M2 was greater than the flow rate M1 were deleted.
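The screening rule for the flow rates amounts to a simple filter. The sketch below assumes each daily measurement is stored as a dictionary with hypothetical keys `M1` and `M2`; the record layout and the numbers are not taken from the study.

```python
def drop_inconsistent_flows(records):
    """Keep only measurements where the return flow does not exceed the supply flow."""
    return [r for r in records if r["M2"] <= r["M1"]]

# Hypothetical daily records, flow rates in t/day
records = [
    {"day": 1, "M1": 520.0, "M2": 515.0},
    {"day": 2, "M1": 518.0, "M2": 523.0},  # M2 > M1: removed
    {"day": 3, "M1": 510.0, "M2": 510.0},
]
clean = drop_inconsistent_flows(records)
```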
The following parameters of the secondary coolant intended for heating buildings are extracted from the archival telemetry data: the temperature of the coolant, °C, at the outlet of the CHP (t1) and at its inlet (t2); the volume flow, m3/day, in the supply pipeline at the outlet of the CHP (V1) and in the return pipeline at its inlet (V2); and the outdoor air temperature, °C (to).
The mass flow rate Mm, t/day, used in the modeling is determined by the formula

$$M_m = 0.5\,(M_1 + M_2),$$

where $M_1 = \rho(t_1)\,V_1$ and $M_2 = \rho(t_2)\,V_2$ are the mass flow rates, t/day, in the supply pipeline at the outlet of the CHP and in the return pipeline at its inlet; $\rho(t_1)$ and $\rho(t_2)$ are the densities of the coolant at temperatures t1 and t2, respectively, found using formula (19), which approximates tabular data [1].
The ranges of the variables Mm, to, t1 and t2 in the final sample, the estimated thermal load Qc of the connected subscribers and the sample sizes n are presented in Table 2.
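The conversion from volume flow to the average mass flow can be sketched as below. The density function is an illustrative quadratic fit to tabulated water densities that is assumed here only to make the example runnable; it is not formula (19) of the paper, whose coefficients are not reproduced in this text. The function names are likewise hypothetical.

```python
def water_density(t_celsius):
    """Illustrative quadratic fit for water density, t/m^3.

    Assumed for this sketch only; NOT the paper's formula (19).
    """
    return (1000.3 - 0.06 * t_celsius - 0.0036 * t_celsius ** 2) / 1000.0

def mean_mass_flow(v1, v2, t1, t2, density=water_density):
    """Average mass flow M_m = 0.5 * (rho(t1) * V1 + rho(t2) * V2), t/day."""
    m1 = density(t1) * v1  # supply pipeline
    m2 = density(t2) * v2  # return pipeline
    return 0.5 * (m1 + m2)

# Hypothetical daily values: volume flows in m3/day, temperatures in deg C
m_mean = mean_mass_flow(v1=1000.0, v2=990.0, t1=80.0, t2=55.0)
```

Passing the density as a parameter makes it straightforward to substitute the approximation actually used in the study.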
The flow rates Mm (columns 4-5) depend on the object, which is explained by the difference in the connected load of the heating systems (column 3). The relative difference in flows (columns 6-7) can serve as a characteristic of the tightness of the system.
The differences in the ranges of the outdoor air temperature to (columns 8-9) for different objects are explained by the difference in the samples.
The maximum and minimum values of the temperature t1 (columns 10-11) practically do not depend on the object. Object 6 is an exception, which is explained by the range of outdoor temperatures in the sample of this object. The same reason explains the higher minimum value of the water temperature t2 (column 12) for object 6. The maximum values of the water temperature t2 (column 13) depend on the object.
Figure 1 shows the mean square deviations $s_{\hat{Y}_i}$ of the values predicted using the different models for all objects. Regardless of the object, removing the free term from the model leads to an increase in $s_{\hat{Y}_i}$ (compare models IV-V with models VI-VII). Regardless of the object, the accuracy of data approximation using models IV and V is practically the same.
The coefficient of determination R2 for all models of all objects did not fall below 0.945 (model VI, Object 3). The average value of R2 over all models and all objects is 0.994.
Because there were no repeated measurements, the adequacy of the model to the real process was checked by testing the consistency of the frequency distribution of the residuals with the normal distribution law. Table 3 shows the results of the consistency check for all objects. The "+" sign indicates a positive test result and the "-" sign a negative one.
As can be seen from Table 3, the consistency of the distribution of the residuals with the normal law was confirmed for all objects only for model VII.
Table 3. Results of checking the consistency of the frequency distribution of residuals with the normal distribution law.
The frequency distribution of the residuals of models IV and VII for Object 5 is shown in Figure 2. This object is chosen as an example because it has the largest number of negative test results. The average value of the residuals for every model is 0.0. The residuals are randomly distributed, with no trend and no shift (Figure 3).
The coefficients of models IV-VII, obtained by the least squares method from the telemetry data for objects 1, 4 and 5, are shown in Figure 4. These objects were chosen for the following reasons: for object 1, the consistency of the distribution of the residuals with the normal distribution was confirmed for all models; for object 4, it was confirmed for models VI and VII and not confirmed for models IV and V; for object 5, it was confirmed only for model VII (Table 3). Checking the significance of the coefficients showed that all the coefficients of the models are significant.

Discussions
The analysis of the residuals made it possible to detect outliers in the samples of different objects. An outlier could correspond either to a single measurement or to a sequence of measurements. For each object, the locations of the outliers coincided across the different models. Such a coincidence may indicate a change in the operating mode of the system. The "faulty measurements" corresponding to the outliers were removed from the sample, and the coefficients were recalculated. The diagrams shown in Figure 4 correspond to the final sample.
The analysis of the residuals did not reveal a trend, a sharp shift in the level or changes in variance (see Fig. 3).
Analyzing the coefficients of the approximating dependencies, the following can be noted:
1. The free term, coefficient b0, is positive in models IV-V for all objects.
2. The coefficient b1, which accounts for the influence of the temperature t1, varies from 0.71 to 0.90 in models IV-V. Removing the free term b0 from models IV-V leads to an increase in b1. The hypothesis b1 = 1 is confirmed only for object 4.
3. The coefficient b2 accounts for the influence of the flow rate. In most of the cases considered, b2 is negative, which corresponds to the physics of the phenomenon: an increase in flow contributes to an increase in temperature. The violation of this rule for object 2 in models VI and VII is explained by the small range of flow variation, which is 8% for object 2 (Table 2) and leads to an increased error in calculating b2.
4. The coefficient b3 accounts for the influence of the outdoor air temperature to (model V) or of the combination to/M of the outdoor air temperature to with the average flow rate Mm (models IV and VII) or with the flow rate in the return line M2 (model VI). The lowest values of b3 are obtained for model V. The transition from model V to models VI and VII leads to an increase in b3. Using the average flow rate Mm instead of the flow rate M2 does not significantly affect the values of the coefficients.
For object 4, the hypothesis b1 = 1 is confirmed, and the distribution of the residuals is consistent with the normal distribution for models VI and VII. These results make it possible to estimate the indoor air temperature averaged over the observation period by formula (11) and the required thermal load by formula (12).
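The idea behind such an estimate can be illustrated under explicit assumptions, since formulas (11) and (12) are not reproduced in this text. Suppose the heat balance is c M (t1 - t2) = A (ti - to), with A a hypothetical overall heat loss coefficient, and the no-free-term model has the form t2 = b1 t1 + b2 / M + b3 to / M. With b1 = 1, matching terms gives b3 = A / c and b2 = -A ti / c, hence ti = -b2 / b3. The regressors 1/M and to/M, the symbol A and all numbers below are assumptions of this sketch, not results from the paper.

```python
import numpy as np

def indoor_temperature(b2, b3):
    """Average indoor temperature implied by t2 = t1 + b2/M + b3*to/M (assumed form)."""
    return -b2 / b3

# Synthetic data generated from the assumed heat balance
rng = np.random.default_rng(1)
n = 120
t1 = rng.uniform(60.0, 90.0, n)    # supply temperature, deg C
t_o = rng.uniform(-20.0, 8.0, n)   # outdoor temperature, deg C
m = rng.uniform(0.4, 0.6, n)       # flow rate after scaling (assumed units)
a_over_c, t_i_true = 0.5, 22.0     # assumed A/c and indoor temperature
t2 = t1 - a_over_c * (t_i_true - t_o) / m

# Fit the three-coefficient model without a free term
x = np.column_stack([t1, 1.0 / m, t_o / m])
b1, b2, b3 = np.linalg.lstsq(x, t2, rcond=None)[0]
t_i_est = indoor_temperature(b2, b3)
```

On this noise-free synthetic sample the fit returns b1 = 1 and recovers the assumed indoor temperature; the estimate does not depend on the scale factor applied to the flow rate, because b2 and b3 are rescaled together.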
The numerical values of the indoor air temperature ti for object 4 are 24.7 °C (model VI) and 22.5 °C (model VII). The obtained values lie within the realistic range of indoor air temperatures. The estimated required thermal load is 0.952 Gcal/h (model VI) or 0.919 Gcal/h (model VII), which is lower than the design (calculated) value and amounts to 75.6% (model VI) or 73.0% (model VII) of it, respectively. The obtained values do not contradict the conclusions about the excessive release of thermal energy for this object made in [20].
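A quick consistency check of these figures: dividing each estimated load by its stated share of the design value should give approximately the same design load for both models. The snippet uses only the numbers quoted above; the variable names are illustrative.

```python
# (estimated load in Gcal/h, share of the design load) for object 4
estimates = {"VI": (0.952, 0.756), "VII": (0.919, 0.730)}

# Design load implied by each model's figures, Gcal/h
implied_design_load = {
    model: load / share for model, (load, share) in estimates.items()
}
spread = abs(implied_design_load["VI"] - implied_design_load["VII"])
```

Both ratios come out at roughly 1.26 Gcal/h, so the two percentages are mutually consistent to within rounding.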

Conclusions
Models IV-VII include all measured parameters, and the significance of their influence is confirmed for all objects.
Neither the coefficient of determination nor the sum of squared deviations between the predicted values and the measurement results allows preference to be given to any one model.
Checking the consistency of the frequency distribution of the residuals with the normal distribution allows preference to be given to model VII.
Confirmation of the hypothesis b1 = 1 makes it possible to estimate the parameters of a heat supply facility, namely the average indoor air temperature and the required heating load, which confirms the hypothesis that telemetry data can be used to identify the parameters of heat supply facilities.

Fig. 1. Mean square deviations $s_{\hat{Y}_i}$. The maximum value of $s_{\hat{Y}_i}$ does not exceed 1.0 °C (model VI, object 3), which indicates a high accuracy of approximation.

Fig. 2. Frequency distribution of residuals of models IV and VII for object 5.

Table 2. Ranges of variation of the object parameters.