Data mining to predict failures of communication network devices

The complexity of telecommunication systems (TCS) and of communication network configurations, together with the large volume of data used to assess the state of devices, drives the development and improvement of control systems for such networks. With the development of information and software technologies, a further problem is the well-founded choice of an analysis tool that, given its degree of intellectualization and the characteristics of the initial data set, correctly calculates the result and shows its sensitivity to external and internal changes. Modern data processing methods make it possible to carry out a preventive analysis of the future functioning of TCS devices with flexible adjustment of the dependent factors. The choice of method largely affects the result of the assessment, including its adequacy, and the forecast should be assessed against the same property with a high degree of accuracy. Not every method of data analysis suits a particular data set. The paper therefore compares the most popular data mining methods (from autoregression to neural networks) and draws a conclusion about how well the working and predicted data meet the criteria. The authors also set out a verbal way of interpreting the predicted data, using a sample of failures as an example.


Introduction
The problem of collecting and processing data on the functioning of telecommunication systems (TCS) is becoming increasingly urgent in the modern information environment. Advances in technology and growing demands on data transmission make it necessary to improve the systems responsible for the reliable and secure operation of TCS.
Another task of the work is to form proposals for implementing various data processing methods, such as the autoregressive moving-average model (ARMA), the autoregressive integrated moving average with model training (ARIMA), the Holt-Winters model (HW) and a neural network model (NN). These methods will make it possible to monitor the operation of TCS devices more fully and accurately, and to predict operational events based on the collected and processed information.
Adequate use of data collection and processing methods when analyzing the collected operational data will improve the reliability and efficiency of the TCS as a whole. The analysis of the received data will allow problem areas and resource-intensive areas to be identified, which in turn opens up opportunities for prompt measures to optimize network operation and eliminate possible problems.

Characteristics of telecommunication networks and parameters of their functioning for intellectual analysis
Data management technologies include the processes of creating, describing, modifying and using data at the stages of control, evaluation, management, decision making, etc.
The management process begins with the collection of information about the managed object; the information obtained underlies decision making. Data on the state of the object is entered into the information system, where it is accumulated, processed, stored, converted and filtered.
Management processes in the system must meet the principles of adequacy, efficiency, optimality and continuity and, therefore, constantly maintain the relevance, accuracy, consistency of data on the state of the control object.
Data on the functioning of the communication network is a kind of report containing information on the state and performance of the network, as well as data on the technical characteristics of network elements and their deviations from standard values.
The consistency of the interaction between the technology of a telecommunications network and the technology for managing data on its operation is achieved primarily by a single control platform that includes all the necessary modules. The data produced by operational processes must pass through several stages of processing before a forecast is formed and all influencing factors are taken into account.
The main indicators of the functioning of data transmission networks include: the average delay in the transmission of data packets; the deviation from the average value of the packet delay; the data packet loss factor; the error rate in data packets, etc.
This paper presents the results of modeling the analysis of such data, which reflect the implementation of some railway processes. This direction is relevant from the point of view of the safety of transported goods and passengers. Predicting potential faults in the telecommunication networks of railway transport makes it possible to service the relevant devices in advance.
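The indicators listed above can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the function name, the argument names and the choice of mean absolute deviation as the "deviation from the average delay" are hypothetical and are not taken from the study.

```python
from statistics import mean


def network_indicators(delays_ms, packets_sent, packets_received, packets_with_errors):
    """Compute four basic indicators for one measurement interval.

    delays_ms holds the per-packet transmission delays (in milliseconds)
    of the packets that were received. All names here are illustrative.
    """
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    if not delays_ms:
        raise ValueError("at least one delay measurement is required")

    avg_delay = mean(delays_ms)
    # Assumption: deviation is taken as the mean absolute deviation from the average.
    delay_deviation = mean(abs(d - avg_delay) for d in delays_ms)
    # Share of sent packets that never arrived.
    loss_factor = (packets_sent - packets_received) / packets_sent
    # Share of received packets that arrived with errors.
    error_rate = packets_with_errors / packets_received if packets_received else 0.0

    return {
        "avg_delay_ms": avg_delay,
        "delay_deviation_ms": delay_deviation,
        "loss_factor": loss_factor,
        "error_rate": error_rate,
    }


if __name__ == "__main__":
    print(network_indicators([10.0, 12.0, 14.0], 100, 95, 5))
```

Values computed per interval in this way form the time series to which the forecasting methods are applied.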
Telecommunication networks (TS) based on Carrier Ethernet (CE) technology cannot be built without knowing the characteristics of the individual network elements of the TS, taking into account the functional and technological capabilities of this technology. For high-quality control of the TS, the administrator needs to know a large number of different types of TS CE equipment and to process a large amount of information in a timely manner under difficult control conditions. The advantage of modern TS is the availability of automated control processes that allow the use of the built-in diagnostic tools of network elements. The generated set of diagnostic parameters of the network elements of the TS CE is presented in Table 1.

| Parameter | Description |
| --- | --- |
| [name missing] | The speed (per second) at which the switch processes and transmits information. |
| Packet throughput (internal) | Measured by the total amount of data that was forwarded through the ports in the specified amount of time. |
| Frame delay during transmission | The time from the moment the frame was transferred to the switch buffer until it was received on the destination port. |
| Size of the embedded address table | The limit on the number of MAC addresses that the port mapping table can hold in memory. |
| Frame buffer size | Temporary frame storage capacity. |
| Internal bus performance [bps] | |
| Processor performance | |

Data processing methods for research
To implement the autoregressive moving-average (ARMA) method, the autoregressive integrated moving average with model training (ARIMA), the Holt-Winters method (HW) and neural networks (NN), appropriate libraries were installed that provide functionality and tools for working with data and implementing these methods (see table). Choosing the right libraries is an important step, since it makes it possible to use the functionality of the methods effectively and simplifies their implementation. The implementation itself includes the development of appropriate algorithms, the use of the selected libraries for working with data, model training (in the case of ARIMA, HW and NN) and parameter tuning. The purpose of implementing these methods is to analyze time series, identify patterns and trends, and predict future values based on the available data.
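The paper describes the trained models as using a sliding window to create training examples. A minimal sketch of that step is given here, assuming a plain Python list as the time series; the function name and the window length are hypothetical, and this is not the code used in the study.

```python
def sliding_windows(series, window):
    """Turn a time series into (inputs, target) training pairs.

    Each pair holds `window` consecutive values as inputs and the value
    that immediately follows them as the target.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    for inputs, target in sliding_windows([1, 2, 3, 4, 5], 3):
        print(inputs, "->", target)
```

A longer window gives the model more history per example but leaves fewer examples to train on, which matters for short annual series.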
As a result, the analysis and implementation of these methods makes it possible to obtain and visualize more accurate and reliable forecasts, and to better understand the dynamics and relationships in the studied time series. This is useful for making informed decisions and for planning and optimizing processes in the various areas where time series are applied. Combining the methods gives a more complete representation of the data, which is an important aspect of research and of decision making based on the results obtained.

Interpretation of simulation results
In the course of a comprehensive study of models based on the various methods, a single graph was created that combines the forecast results. This made it possible to carry out an exploratory evaluation and to choose the most appropriate method for the forecasting problem on an arbitrary data set (in this case, forecasting the number of railway accidents). Sliding window autoregressive methods demonstrate their usefulness in forecasting. They reflect a moderate decrease in the number of accidents during the forecast period. However, their predictions show some instability, especially over a longer forecasting horizon. This is because these methods rely on linear relationships and do not capture possible complex non-linear trends in the data.
Neural networks are a flexible predictive technique that can adapt to complex data dependencies. They show a moderate reduction in the number of accidents, similar to the autoregressive methods, but with smoother transitions between forecasts. However, neural network predictions can be more prone to noise and fluctuations, which requires careful control and model tuning to achieve the best results.
Based on the analysis of trends and forecast results, the Holt-Winters model stands out as the preferred method for predicting the number of railway accidents. It provides a stable and accurate forecast that matches long-term changes in the data and supports rational decisions based on the predicted values.
Conclusions based on the results for the second data set, the amount of goods transported by rail (Fig. 2):

ARMA (Autoregressive moving-average model):
The ARMA method, based on the autoregressive mathematical model, uses only the historical values of a variable to predict future values. The ARMA forecast graph shows a gradual increase in the amount of cargo transported over a period of 20 years. The ARMA method is easy to implement and fast to run, but it does not take into account other factors, such as seasonality and trends, that can affect the data.
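The idea of predicting future values only from historical ones can be sketched with the simplest autoregressive case. The example assumes a first-order model y_t = c + phi * y_(t-1) fitted by ordinary least squares and leaves out the moving-average part of ARMA entirely; the function names are hypothetical and the study's own implementation may differ.

```python
def fit_ar1(series):
    """Estimate c and phi in y_t = c + phi * y_(t-1) by ordinary least squares."""
    if len(series) < 3:
        raise ValueError("at least three observations are required")
    x = series[:-1]  # previous values
    y = series[1:]   # values that followed them
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("the series is constant, so phi cannot be estimated")
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(series, steps):
    """Forecast `steps` values ahead by feeding each prediction back in."""
    c, phi = fit_ar1(series)
    last = series[-1]
    forecast = []
    for _ in range(steps):
        last = c + phi * last
        forecast.append(last)
    return forecast


if __name__ == "__main__":
    # Illustrative series only, not data from the paper.
    print(forecast_ar1([10.0, 7.0, 5.5, 4.75, 4.375], 3))
```

Because each forecast step reuses the previous prediction, errors accumulate with the horizon, which is consistent with the instability noted for longer-term forecasts.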

ARIMA (Autoregressive Integrated Moving Average with Model Training):
The ARIMA forecast chart shows a similar trend to ARMA, but with some differences. ARIMA can capture complex trends and dependencies in the data better than a simple ARMA model. ARIMA requires more computational resources and more time to train the model than a simple ARMA model.
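What the "integrated" part of ARIMA adds can be shown with its simplest member. The sketch assumes ARIMA(0,1,0) with drift, that is, a random walk with drift: the series is differenced once, the average difference is used as the forecast of each future step, and the result is summed back to the original scale. The function names are hypothetical and no model training is shown.

```python
def difference(series):
    """First-order differencing: the change between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(last_value, diffs):
    """Rebuild levels from forecast differences, starting at the last observation."""
    levels = []
    level = last_value
    for d in diffs:
        level += d
        levels.append(level)
    return levels


def forecast_random_walk_with_drift(series, steps):
    """ARIMA(0,1,0) with drift: every future step equals the mean past change."""
    diffs = difference(series)
    if not diffs:
        raise ValueError("at least two observations are required")
    drift = sum(diffs) / len(diffs)
    return undifference(series[-1], [drift] * steps)


if __name__ == "__main__":
    # Illustrative series only, not data from the paper.
    print(forecast_random_walk_with_drift([100.0, 103.0, 107.0, 110.0], 3))
```

Differencing removes a steady trend before the autoregressive and moving-average terms are fitted, which is one reason ARIMA can follow trending data that a plain ARMA model handles poorly.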

HW (Holt-Winters method)
The forecast graph of the Holt-Winters model shows an increase in the number of transported goods at the beginning of the forecast period, and then a decrease at the end of the period.
The Holt-Winters model can be useful when the data has strong seasonal components that need to be taken into account in forecasting.
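How the Holt-Winters model separates level, trend and seasonality can be sketched as follows. This assumes the additive form of the method with a simple initialization from the first two seasons; the function name, the initialization and the smoothing parameters are illustrative assumptions, not the settings used in the study.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, steps):
    """Additive Holt-Winters (triple exponential smoothing) forecast.

    alpha, beta and gamma are the smoothing parameters (between 0 and 1)
    for the level, the trend and the seasonal components.
    """
    m = season_length
    n = len(series)
    if m < 1 or n < 2 * m:
        raise ValueError("at least two full seasons of data are required")

    # Initial level: mean of the first season.
    level = sum(series[:m]) / m
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    # Initial seasonal terms: deviation of the first season from its mean.
    seasonal = [series[i] - level for i in range(m)]

    for t in range(m, n):
        s = seasonal[t % m]
        prev_level = level
        level = alpha * (series[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (series[t] - level) + (1 - gamma) * s

    # The h-step-ahead forecast extends the trend and reuses the matching seasonal term.
    return [level + h * trend + seasonal[(n + h - 1) % m] for h in range(1, steps + 1)]


if __name__ == "__main__":
    # Illustrative series only, not data from the paper.
    data = [12.0, 18.0, 11.0, 17.0, 13.0, 19.0, 12.0, 18.0]
    print(holt_winters_additive(data, 2, 0.3, 0.1, 0.2, 4))
```

If the seasonal swings grow with the level of the series, the multiplicative variant of the method is usually the better fit.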

NN (Neural Networks):
A neural network method using LSTM (Long Short-Term Memory) makes it possible to model complex dependencies and trends in data.
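The mechanism by which an LSTM carries information along a sequence can be sketched with a single cell that has a scalar input and a scalar state. This is only an illustration of the standard gate equations: the weights are untrained, the parameter layout is hypothetical, and a practical model would use a deep learning library with vector-valued states.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each gate name ('i', 'f', 'o', 'g') to a tuple
    (input weight, recurrent weight, bias).
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre_activation("i"))    # input gate: how much new information to store
    f = sigmoid(pre_activation("f"))    # forget gate: how much old memory to keep
    o = sigmoid(pre_activation("o"))    # output gate: how much memory to expose
    g = math.tanh(pre_activation("g"))  # candidate value for the memory cell
    c = f * c_prev + i * g              # updated cell state (long-term memory)
    h = o * math.tanh(c)                # updated hidden state (output)
    return h, c


def run_lstm(sequence, params):
    """Feed a sequence through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


if __name__ == "__main__":
    # Arbitrary, untrained weights chosen only to make the example run.
    demo = {"i": (0.5, 0.1, 0.0), "f": (0.2, 0.1, 1.0),
            "o": (0.3, 0.1, 0.0), "g": (0.8, 0.2, 0.0)}
    print(run_lstm([0.1, 0.4, 0.3], demo))
```

The forget and input gates let the cell decide at each step what to keep from earlier observations, which is what allows it to represent longer time dependencies than a fixed-order autoregression.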
The neural network forecast graph shows a gradual increase in the amount of goods transported over 20 years, similar to ARMA and ARIMA. LSTM can be useful when working with data containing complex dependencies and nonlinear trends.

General conclusions:
Based on the forecast graphs, all four methods (ARMA, ARIMA, HW, NN) show similar trends in the forecast of the amount of goods transported. The ARMA and ARIMA methods, which are based on autoregression, give smoother and more gradual trends.
The HW method takes the seasonality of the data into account and shows a more complex trend, with an increase and then a decrease in the amount of goods transported. Neural networks using LSTM model complex dependencies in the data and give a prediction similar to ARMA and ARIMA.
The choice of the most successful method depends on the characteristics of the data and the goals of forecasting. If seasonality is important, HW may be preferred. If more complex dependencies and trends need to be taken into account, ARIMA and NN may be more suitable. ARIMA and NN require more computational resources and more time to train the model, but can provide more accurate predictions in complex scenarios.
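The choice between methods can also be supported numerically by comparing each forecast with held-out observations. The sketch below assumes three common error measures (MAE, RMSE, MAPE) and ranks methods by RMSE; the metric choice, the function names and the sample values are illustrative assumptions and do not reproduce results from the paper.

```python
import math


def _check(actual, predicted):
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more strongly."""
    _check(actual, predicted)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined when an actual value is zero."""
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rank_methods(actual, forecasts):
    """Return (method name, RMSE) pairs sorted from best to worst."""
    scores = [(name, rmse(actual, pred)) for name, pred in forecasts.items()]
    return sorted(scores, key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up hold-out values and forecasts, for illustration only.
    held_out = [10.0, 20.0, 30.0]
    candidates = {"method_a": [12.0, 18.0, 33.0], "method_b": [10.5, 19.0, 30.5]}
    print(rank_methods(held_out, candidates))
```

Scoring on observations that were not used for fitting keeps the comparison from favoring a model that merely memorizes the training data.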

Proposals for the application of the results of data processing
Further research focused on proposals for applying the developed models to the functioning of the transport communication network. The simulation results showed that, with adequate use of the data set, it is possible to simulate the traffic of communication networks and to predict equipment failures and other events.
Studies were also conducted to evaluate the sensitivity of each of the developed models to external factors and to the features of the data sets. This made it possible to make suggestions on the choice of the most appropriate model for specific scenarios and conditions.
The obtained models and the proposals for their application in the data analysis of TCS devices can significantly increase the efficiency and reliability of the intelligent data processing system. Additional studies and experiments will allow the results of modeling and forecasting to be interpreted in more detail, and the models to be adapted to specific conditions and system requirements.

Conclusions
The studied forecasting methods make it possible to improve the reliability and stability of equipment. Forecasting anticipates potential problems, so that appropriate measures can be taken in advance.
The chosen prediction methods help to reduce the number of potential failures. They reveal hidden trends, seasonality and other factors that can lead to failures or problems in the system. With accurate forecasts, preventive work can be planned to reduce the risk of failures and ensure uninterrupted system operation.
The use of the studied methods in processing and analyzing data on the functioning of telecommunication networks will make it possible to control the transmission of traffic more efficiently. On this basis, the demand for resources, network load and volumes

Fig. 1. Comparison of methods for predicting the number of accidents that have occurred involving railway transport

Analyzing the presented graph, which clearly shows the trends and forecasts obtained using the various methods, it is possible to establish the following:
1. ARMA: This method uses a mathematical approach and is based on an autoregressive model that predicts future values based only on the previous values of the time series. The forecast obtained using ARMA shows a gradual increase in the number of accidents over time.
2. ARIMA: Unlike ARMA, this method uses model training to make predictions. A sliding window is used to create training examples, and the model is then trained on them. The ARIMA-based forecast shows a smoother trend with little fluctuation, reflecting more complex relationships and trend changes in the data.
3. HW: This model is used to predict the trend and seasonality of the data. The forecast based on the Holt-Winters model shows a decrease in the number of accidents over time; moreover, the model is able to capture and take into account long-term trends and changes in the data.
4. NN: The code uses a neural network model with an LSTM layer to predict the number of accidents. A forecast made using neural networks can capture complex time dependencies and trend changes in the data. In this case, the predicted values show a decrease in the number of accidents over time.

Forecast trends:
ARMA: Gradual increase in accidents over time.
ARIMA: A smoother trend with little fluctuation, reflecting more complex relationships and trend changes in the data.
HW: A decrease in the number of accidents over time, given the trend and seasonal patterns.
NN: A decrease in accidents over time, which may indicate the detection of some complex time patterns and trend changes.
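Verbal descriptions of forecast trends such as those above can be derived from the forecast values themselves. The sketch below is one possible rule and is purely an assumption: it compares the first and last forecast values and uses a 5 % relative tolerance to separate "increase", "decrease" and "stable"; neither the rule nor the threshold comes from the paper.

```python
def describe_trend(forecast, tolerance=0.05):
    """Label a forecast as 'increase', 'decrease' or 'stable'.

    The label is based on the relative change between the first and the last
    forecast value; `tolerance` is an assumed threshold, not a value from the study.
    """
    if len(forecast) < 2:
        raise ValueError("at least two forecast values are required")
    first, last = forecast[0], forecast[-1]
    base = abs(first) if first != 0 else 1.0  # avoid division by zero
    change = (last - first) / base
    if change > tolerance:
        return "increase"
    if change < -tolerance:
        return "decrease"
    return "stable"


if __name__ == "__main__":
    # Made-up forecasts, for illustration only.
    examples = {"method_a": [100.0, 105.0, 112.0], "method_b": [100.0, 95.0, 90.0]}
    for name, values in examples.items():
        print(name, describe_trend(values))
```

A rule of this kind only summarizes direction; it does not replace an accuracy check of the forecast against observed data.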

Fig. 2. Comparison of methods for predicting the amount of goods transported by rail

Another data set was used to predict the amount of goods transported by rail. The same four prediction methods were applied in the code: ARMA (sliding window autoregression, mathematical model), ARIMA (sliding window autoregression with model training), the Holt-Winters (HW) model, and neural networks (NN).

Table 1. Set of diagnostic parameters of the network elements of the TS CE