Predictive analytics for ensuring the autonomy of socially significant elements of urban infrastructure

The article is devoted to the study of publications on the use of predictive analytics in the construction industry and on ensuring the autonomy of urban infrastructure elements using Industry 4.0 technologies. The materials for the study were publications indexed in the international database Scopus in the period from 2017 to 2022. It was found that the most popular publications relate mainly to substantiating the cost of investments in construction, predicting the properties of concrete and reinforced concrete structures, and using information modelling technologies in integration with machine learning models, including as part of the design of capital construction projects. However, there are no publications that consider the use of Industry 4.0 technologies and predictive analytics to ensure the autonomy of socially significant elements of urban infrastructure, or even of capital construction projects. This raises the issue of determining the sufficiency and completeness of the data that need to be collected and processed in order to identify critical deviations of the system and ensure the autonomy of socially significant elements of urban infrastructure by comparing a reference model of the operation of an object or its elements with measurements collected from the system in real time.


Introduction
At present, life support facilities, which are socially significant elements of urban infrastructure, are repaired either after an accident, i.e., at the time of the emergency, or according to schedules prescribed by the equipment manufacturer or by the standard operating life.
At the same time, digital transformation allows the use of new approaches in the management of life-support facilities, for example, predictive analytics methods that make it possible to justify the need to repair the elements of an object based on its state.
Predictive analytics refers to data analytics, i.e., the process of examining, cleaning, transforming, and modelling data to find systemic patterns in a data set to substantiate conclusions and support optimal decision making.
By degree of complexity, it is customary to distinguish several types of data analytics (Fig. 1). The first group is responsible for creating a historical database that can be used for further analysis. The second group is responsible for identifying the factors that influence the system, i.e., it uses statistical methods of data analysis to detect correlations in the data through clustering, classification, and drill-down. Predictive analytics belongs to the third group and can be used to predict probabilistic events based on previously accumulated information. Predictive analytics methods include machine learning, mathematical statistics, modelling, data mining, etc. The last group, prescriptive analytics, makes it possible to develop the optimal solution for the situation under consideration based on previously collected data and to justify the feasibility of a particular solution, for example, whether complete replacement or repair of a system element is more efficient.
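As a minimal illustration of the third group (predictive analytics) applied to condition-based repair, the sketch below fits a linear trend to accumulated readings and estimates when the trend will reach a limit. The pump, the vibration readings, and the alarm level are all hypothetical values chosen for illustration; a linear trend is only one of many possible models and is not taken from the reviewed publications.

```python
from statistics import mean


def fit_linear_trend(times, values):
    """Ordinary least-squares fit of values = slope * t + intercept."""
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_bar - slope * t_bar


def time_to_threshold(times, values, threshold):
    """Time at which the fitted trend reaches `threshold`.

    Returns None when the trend is flat or falling, i.e. no crossing is predicted.
    """
    slope, intercept = fit_linear_trend(times, values)
    if slope <= 0:
        return None
    return (threshold - intercept) / slope


# Hypothetical history: one vibration reading per month for a pump (mm/s).
months = [0, 1, 2, 3, 4, 5]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
ALARM_LEVEL = 7.0  # assumed limit, for illustration only

predicted_month = time_to_threshold(months, vibration, ALARM_LEVEL)
print(f"Trend reaches the alarm level around month {predicted_month:.1f}")
```

In this reading, the stored history corresponds to the first group, the fitted trend to the predictive step, and the choice between repair and replacement before the predicted month would belong to prescriptive analytics.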

Fig. 1. Classification of data analytics types
Thus, predictive analytics belongs to the class of data analysis methods responsible for predicting the behaviour of a system or its individual elements so that timely management decisions can be made to prevent emergency situations. Figure 2 shows an enlarged block diagram of the stages of implementing predictive analytics. At the same time, it should be noted that predictive analytics is currently used mainly in industries such as banking and financial services, insurance, the public sector, pharmaceuticals, telecom, and IT. There are examples of the use of predictive analytics in the construction industry, but they are rather pioneering in nature and mainly relate to the operation of technologically sophisticated equipment. The purpose of the article is to analyse the publications presented in the international knowledge base Scopus in order to determine the most promising.

Materials and methods
Figure 3 shows an enlarged scheme of the presented study.

Fig. 3. Enlarged scheme of the study
The main research base is the international knowledge base Scopus. Based on the purpose of the study, the main selection criteria were keywords, period of publication, and branches of knowledge. To ensure the relevance of the study, articles published over the past 5 years were used.

Results
The analysis was carried out on publications indexed in the international knowledge base Scopus, selected for the period from 2017 to 2022 by the keywords "predictive analytics" + "construction". Publications were taken from the fields of engineering and computer science.
Figure 4 shows the relationship of the keywords of the sample collected according to the criteria indicated in the "Materials and methods" section. The minimum number of occurrences of a keyword in the selected publications was set to 15; 82 of the 7720 keywords met this threshold. For each of the selected keywords, the program calculated the total strength of its co-occurrence links with other keywords.
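The keyword selection and link-strength calculation can be sketched as follows. This is an illustrative reading of the procedure, not the code of the program used in the study: it assumes that a link between two keywords gains one unit of strength for each publication that lists both, and that the total link strength of a keyword is the sum over all its links. The function name `keyword_network` and the toy sample are hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_network(publications, min_occurrences):
    """Select keywords by occurrence threshold and compute co-occurrence link strengths.

    publications: list of keyword lists, one list per publication.
    Returns (occurrences of kept keywords, pairwise link strengths, total link strength).
    """
    # Count in how many publications each keyword appears.
    occurrences = Counter(kw for kws in publications for kw in set(kws))
    selected = {kw for kw, n in occurrences.items() if n >= min_occurrences}

    # Each publication listing both keywords adds 1 to the strength of their link.
    link_strength = Counter()
    for kws in publications:
        present = sorted(set(kws) & selected)
        for a, b in combinations(present, 2):
            link_strength[(a, b)] += 1

    # Total link strength of a keyword = sum of the strengths of all its links.
    total = {kw: 0 for kw in selected}
    for (a, b), strength in link_strength.items():
        total[a] += strength
        total[b] += strength

    kept = {kw: occurrences[kw] for kw in selected}
    return kept, dict(link_strength), total


# Toy sample (hypothetical keywords, threshold lowered to 2 for readability).
sample = [
    ["predictive analytics", "machine learning", "concrete"],
    ["machine learning", "concrete", "forecasting"],
    ["predictive analytics", "machine learning"],
    ["bim", "construction industry"],
]
kept, links, total = keyword_network(sample, min_occurrences=2)
for kw in sorted(total, key=total.get, reverse=True):
    print(kw, kept[kw], total[kw])
```

With the study's settings, `min_occurrences` would be 15 and `publications` would hold the keyword lists exported from Scopus.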
As can be seen from the figure, several large clusters can be distinguished in the use of predictive analytics in construction. The main cluster, highlighted in red, covers general issues related to predictive analytics in construction; it includes the algorithms used to create predictive models in construction and also shows a connection with concrete construction. The second cluster, highlighted in green, deals with machine learning issues. The third cluster, highlighted in blue, is dedicated to the process of creating forecasts. The yellow cluster is dedicated to concrete construction issues. At the next stage, publications with more than 50 citations were analysed; Table 1 presents the results.

Link | Summary of the publication | Highlights
[1] | A review on the use of Riemannian geometry for brain-computer interface and a primer on the classification frameworks based on it are presented in the article. | Riemannian geometry, brain-computer interface
[2] | The main objective of this article is to evaluate and compare the performance of different machine learning, considering the influence of various training to testing ratios in predicting the soil shear strength. | Machine learning, soil mechanics
[3] | In order to improve the accuracy of project cost forecasting, given the limitations of existing models, a construction cost forecasting model based on SVM (Standard Support Vector Machine) and LSSVM (Least Squares Support Vector Machine) is proposed in the article. | Justification of the cost of construction / investment
[4] | The article presents the results of the research work about development the optimum machine learning algorithm for predicting the compressive and flexural strengths of steel fiber-reinforced concrete. | Concrete strength prediction
[5] | The article presents the results of the identification and development of machine learning models to facilitate accurate project delay risk analysis and prediction using objective data sources. | Project risk analysis and prediction
[6] | The paper presents a practical yet comprehensive implementation of the ensemble methods for prediction of the shear strength for reinforced concrete deep beams with/without web reinforcements. | Strength prediction of reinforced concrete beams
[7] | The article reviews the current state-of-the-art of electric load forecasting technologies and presents recent works pertaining to the combination of different ML algorithms into two or more methods for the construction of hybrid models. | Machine learning, electric load forecasting technologies
[8] | The article presents a data-driven model for digital twin, together with a hybrid model prediction method based on deep learning that creates a prediction technique for enhanced machining tool condition prediction. | Digital twin, deep learning
[9] | The article presented study is about prediction of the land use/land cover, seasonal (summer & winter) land surface temperature, and urban thermal field variance index variations using machine learning algorithms in Cumilla City Corporation, Bangladesh. | Machine learning
[10] | The article presents a data-driven approach to the shear strength of steel fibers in a concrete beam and incorporates the largest database compilation of 507 experimental data. | Strength of steel fibers in a concrete beam
[11] | The study develops a BIM-based computational tool for building waste analytics and reporting in the construction supply chains. | BIM, building waste analytics
[12] | The goal of this research is to articulate the big picture ideas and their role in advancing the development of Explainable AI systems. | Explainable AI
[13] | Full-face tunnel boring machine (TBM) is a modern and efficient tunnel construction equipment. A reliable and accurate TBM performance prediction can reduce the cost and help to select the appropriate construction method. This study introduces a new hybrid intelligence technique, i.e., grey wolf optimizer-feature weighted-multiple kernel-support vector regression (GWO-FW-MKL-SVR) to predict TBM PR. | Tunnel boring machine, predict TBM PR
[14] | In this paper, authors present a component-based approach that develops machine learning models not only for a parameterized whole building design, but for parameterized components of the design as well. | Machine learning in building design
[15] | Developments on shear strength (Vf) of steel fiber-reinforced concrete beam simulation have been shifted to the implementation of the computer aid advancements. The current study is attempted to explore new hybrid artificial intelligence (AI) model called integrative support vector regression with firefly optimization algorithm (SVR-FFA) for shear strength prediction of steel fiber-reinforced concrete beam. | Artificial intelligence, steel fiber-reinforced concrete beam
[16] | The real-time acquisition of surrounding rock information is important for the efficient tunneling and hazard prevention in tunnel boring machines (TBMs). This study presents an ensemble learning model based on classification and regression tree and AdaBoost algorithm to predict the classification of surrounding rock mass. | Learning model based, TBMs
[17] | In this study, attempts are made to review the state-of-the-art BIM applications on design and prefabrication automation of industrialized buildings, with more emphasis on the recent achievement in concrete 3D printing technology. Following this, a BIM method is proposed to support the detailed geometry design and digital fabrication of modular housings. | Review, BIM, design, 3D printing technology
[18] | The article presents the deep code learning technique and apply it to the Macao intelligent system. | Intelligent system, deep learning
[19] | The novel interval prediction model based on temporal convolutional networks to forecast wind speed is presented in the article. | Prediction model, forecast wind
[20] | The real-time air pollution data are remarkably important in controlling air pollution for urban sustainability and protecting humans against the air pollution damages. The article intends to investigate whether and how air pollution using cost effective means and without using the expensive pollution sensors and facilities can be measured. Predictive model for particulate matter prediction is presented. | Predictive model, air pollution
[21] | This study presents an efficient and accurate methodology to estimate the resilient modulus of subgrade soils. | Soils
[22] | This paper comprehensively reviews the factors influencing the autogenous shrinkage and drying shrinkage of recycled aggregate concrete. | Review, recycled concrete
[23] | This work tests four different machine learning approaches, namely Support Vector Regression, Random Forest, Artificial Neural Networks, and Extreme Gradient Boosting, on high-density GatorEye UAV-Lidar point clouds for indirect estimation of individual tree dendrometric metrics such as diameter at breast height, total height, and timber volume. | Machine learning
[24] | Blasting operations typically have several negative impacts upon human beings and constructions in adjacent region. Among all, air overpressure (AOp) has been persistently attractive to practitioners and researchers. To control the AOp-induced damage, its strength should be predicted before conducting a blasting operation. This paper analyzes the AOp consequences using the Fuzzy Delphi method (FDM). The method was adopted to identify the key variables with the deepest influence on AOp based on the experts' opinions. | Prediction, air overpressure
[25] | In this article probabilistic prediction approach produces a holistic probability distribution over the entire outcome space to quantify the uncertainties related to construction cost predictions. | Cost predictions
[26] | This research uses machine learning technique to analyse 16 critical factors and assess the impact of diverse combinations of factors on the performance of predicting the severity of construction accidents. | Machine learning
[27] | This paper provides an enhanced understanding of new opportunities created to optimize operations of highways infrastructure using the recent growth in Big Data analytics and data integration technologies. | Highways infrastructure, Big Data analytics
[28] | Shear strength of corroded reinforced concrete (CRC) beams is a key concern in the design and/or retrofit processes for an RC structure during its life cycle. In this paper, the authors developed a machine learning based approach for predicting the residual shear strength of CRC beams at different service times. | Machine learning, reinforced concrete
[29] | This study evaluates the efficiency of machine learning-based approaches in establishing accurate prediction models for the punching shear strength of flat slabs without transverse reinforcement. | Machine learning, prediction models for the punching shear strength
[30] | The present work explores adequate regression models, probability distributions and uncertainty variation of the demand models. In addition, the adequacy of several ground motion intensity measures to be used for predictive modelling of local EDPs is investigated. | Predictive model, demand models

The review of publications revealed that the most popular publications mainly concern the issues of substantiating the cost of investments in construction, predicting the properties of reinforced concrete and concrete structures, using information modelling technologies in integration with machine learning models, including as part of the design of capital construction projects, etc.
However, there are no publications considering the use of Industry 4.0 technologies and predictive analytics to ensure the autonomy of capital construction projects.

Conclusions
As part of the presented work, a hypothesis is put forward that the autonomy of socially significant elements of urban infrastructure can be ensured through the integrated use of technologies such as the digital twin, the Internet of Things, and predictive analytics methods. At the same time, it should be noted that it is currently customary to distinguish several stages of maturity of the "digital twin" technology (Fig. 5).

Fig. 5. Maturity Stages of Digital Twin Technology
In this study, the digital twin is understood as a dynamic virtual model of the object that allows real-time monitoring of system elements and of the processes occurring at the facility, and that can be used to prevent accidents by introducing predictive analytics, where the key element is data, both historical and received in real time.
In [31], the concept of the autonomy of socially significant elements of urban infrastructure is considered in detail. It is noted that this concept is complex and consists of such parameters as reliability, natural-technogenic safety, and sustainability. However, each of these parameters is integral and depends on the type of urban infrastructure element under consideration.
In general, the reliability parameter can be represented as follows:
In general terms, the reliability of the construction object can be represented as follows:
The sustainability of the construction object will take the following form:
Natural and technogenic safety will be described as follows:
Thus, according to (1-3), the autonomy of a construction object can be represented as the following matrix:
(5)
where k_1m is the significance of the integral parameter when calculating the autonomy of an object, and m refers to the kind and type of object for which autonomy is calculated.
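The role of the significance coefficients can be illustrated with a small sketch. It is not the article's formula: it assumes, purely for illustration, that each integral parameter is normalised to the range 0 to 1 and that the parameters are combined as a weighted average whose weights depend on the object type m. The function name, the object types, and all numeric values are hypothetical.

```python
def autonomy_score(parameters, significance):
    """Weighted average of integral parameters (an assumed form, not the article's formula).

    parameters:   integral parameter values, e.g. reliability, safety, sustainability.
    significance: significance coefficient of each parameter for one object type.
    """
    if set(parameters) != set(significance):
        raise ValueError("parameters and significance must have the same keys")
    total_weight = sum(significance.values())
    if total_weight <= 0:
        raise ValueError("significance coefficients must sum to a positive value")
    weighted = sum(significance[name] * parameters[name] for name in parameters)
    return weighted / total_weight


# Hypothetical significance coefficients for two object types (index m).
SIGNIFICANCE_BY_TYPE = {
    "boiler house": {"reliability": 0.5, "safety": 0.3, "sustainability": 0.2},
    "pumping station": {"reliability": 0.4, "safety": 0.4, "sustainability": 0.2},
}

# Hypothetical integral parameter values for one object, normalised to [0, 1].
state = {"reliability": 0.9, "safety": 0.8, "sustainability": 0.7}

score = autonomy_score(state, SIGNIFICANCE_BY_TYPE["boiler house"])
print(f"Autonomy score: {score:.2f}")
```

Keeping the coefficients in a table keyed by object type mirrors the statement that the significance of a parameter depends on the kind and type of object being assessed.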
This raises the issue of determining the sufficiency and completeness of the data that need to be collected and processed in order to identify critical deviations of the system and ensure the autonomy of socially significant elements of urban infrastructure by comparing a reference model of the operation of an object or its elements with measurements collected from the system in real time. This issue will be the subject of further research.
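The comparison of a reference model with real-time measurements can be sketched as a simple residual check. This is a minimal sketch under stated assumptions: the reference model is reduced to a list of expected values, a deviation counts as critical only when it exceeds a tolerance for several consecutive samples, and the function name, tolerance, and readings are hypothetical.

```python
def critical_deviations(reference, measured, tolerance, min_consecutive=3):
    """Return sample indices at which a critical deviation is flagged.

    A sample is flagged when |measured - reference| has exceeded `tolerance`
    for at least `min_consecutive` samples in a row, which filters out
    isolated spikes such as single sensor glitches.
    """
    if len(reference) != len(measured):
        raise ValueError("reference and measured must have the same length")
    alarms = []
    run = 0  # length of the current run of out-of-tolerance samples
    for i, (ref, obs) in enumerate(zip(reference, measured)):
        run = run + 1 if abs(obs - ref) > tolerance else 0
        if run >= min_consecutive:
            alarms.append(i)
    return alarms


# Hypothetical example: expected supply temperature vs. sensor readings (deg C).
reference = [50.0] * 8
measured = [50.2, 49.8, 53.5, 50.1, 53.0, 53.4, 54.1, 54.8]

alarms = critical_deviations(reference, measured, tolerance=2.0, min_consecutive=3)
print(alarms)
```

The single spike at index 2 is ignored, while the sustained drift from index 4 onwards is flagged once it has lasted three samples. Which signals to compare and how long a history is needed is exactly the data sufficiency question left for further research.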

Fig. 2. Enlarged block diagram of the stages of predictive analytics implementation.

Fig. 4. The relationship of the sample keywords in the form of a cluster map
Thus, according to Figure 4, it can be concluded that today predictive analytics in construction is mainly considered in the field of concrete construction.