Artificial intelligence methods to control the energy efficiency of electric rolling stock online

International practice in organizing the energy consumption control of electric rolling stock is analyzed. The analysis shows that this task is currently solved mainly by analytical methods. These methods rely on simulation models, which are usually built on the Pontryagin maximum principle. However, given the development of systems for recording the motion parameters of electric rolling stock, as well as other automated systems of Russian Railways, it seems promising to develop and study artificial intelligence methods and algorithms for real-time monitoring of electric rolling stock energy consumption. It was also found that even the most modern motion parameter recorders have a number of significant drawbacks from the data analysis point of view. These include insufficient data and their low reliability, no link between the recorded data and trips or locomotive teams, and the impossibility of choosing a constant interval for recording measurement results. Further drawbacks are a high probability of errors when recording data to the cartridge, the lack of a GPS/GLONASS satellite navigation system, the lack of wireless data transmission, imperfect software, and inconvenient and incomplete export of data from a cartridge file. To test the energy efficiency assessment of electric rolling stock within the limits of arbitrary energy tracking areas, software was developed that uses data from the motion parameter recorders. However, the full implementation of the proposed consumption tracking method requires a new comprehensive automated system. Such a system should combine the entire set of measured parameters, both for electric rolling stock and for the traction power supply system.


Introduction
Russian Railways is one of the largest consumers of energy resources, accounting for about 4.4% of all electricity generated in Russia. 85% of that energy is used for train traction, which is why one of the priorities of the Russian Railways energy strategy till 2020 and for the long term up to 2030 (developed in accordance with the Strategy for Scientific and Technological Development of the Russian Federation) is a significant increase in the energy efficiency of train traction. Thus, for Russian Railways as a whole, the predicted decrease in the specific consumption of fuel and energy resources for train traction should be from 2.5 to 4.4% by 2020 compared with the level of 2015, and from 8.0 to 9.0% by 2030. This can be achieved, in particular, through a qualitative improvement in the management of fuel and energy consumption based on modern information technologies and on systems for tracking, regulating and monitoring energy consumption.
Currently, Russian Railways operates various automated software systems that, in particular, organize the control of electricity consumption for train traction. One such system is the centralized route processing for drivers (CRPD) system. It relies on manual processing of the data contained in the running schedules of the drivers, including electronic ones, followed by statistical processing. However, an analysis of the sources [1][2][3] showed that the methods and approaches for organizing energy consumption control for train traction based on CRPD data have the following disadvantages:
a) The energy consumption of the rolling stock is analyzed using data from the routes of drivers, which indicate only the total energy consumed (and returned) over the ride. The limited information available in the driver's schedule does not allow a comprehensive analysis of the energy consumption of electric rolling stock for a separate track section at a certain point in time, nor does it allow the amount of unproductive electricity losses to be determined.
b) The relevant sections of the running schedule are filled in incorrectly, or the data on the energy consumed for train traction and returned to the contact network through recovery are missing altogether. This distorts the reporting data and leads to underestimating the energy resources of Russian Railways. Such cases are far from isolated in operation.
c) Late delivery of running schedules disrupts the control of energy consumption and return, and also impedes the rational planning of technical and economic indicators of locomotives and locomotive teams.
At present, specialists of JSC "Railway Research Institute" under the leadership of Professor L. Muginshtein have introduced a hardware and software system for managing the transportation process over large railway operating domains. The system is based on intelligent systems that implement medium-term and operational calculation of energy-saving passenger and freight train schedules, as well as operational coordination of the technological processes that ensure schedule adherence. Within the framework of this complex, a method, among others, has been developed and practically implemented for calculating the energy-optimal train trajectories along railway sections at given travel times. This method takes into account the track profile, the masses and formation patterns of trains, and their speed limits. The method is based on the optimal control theory developed by academician L. S. Pontryagin, supplemented by original computational algorithms that were implemented as PC programs [3][4][5][6][7][8][9][10][11]. However, despite the considerable number of advantages of this complex, the following significant drawbacks, which limit a further increase in the energy efficiency of train traction, should be noted:
- the software solution implies the manual input of information from the routes of drivers; therefore, all the above-mentioned disadvantages of the CRPD system also apply to this software package.
- the energy consumption for the locomotive's own needs is determined according to the standards established by the Rules for Traction Calculations (RTC). Firstly, these standards are given in the RTC for average network conditions, which does not reflect the actual consumption in the actual train situation. Secondly, additional studies are required when new locomotive series are introduced into operation.
- the energy consumption for carrying out technological operations during the ride is determined only if the driver has certain data. Such data include single running of a locomotive, maneuvers, demurrage while waiting for operation, making up schedule time, temporary speed limits, non-scheduled stops and stops at prohibiting traffic signals, all considered against operational standards that are designed for average network conditions. As a result, the unproductive energy losses in the real train situation cannot be estimated.
- there is no option for taking into account factors that cannot be calculated (accuracy and malfunction of consumption tracking devices, significant deviations from rational operating modes, the experience and skill of drivers, special weather conditions, the technical state of the track and rolling stock, etc.);
- a system of correction factors is used, which are obtained from trial rides and have a large spread;
- the calculation is performed for a single train, which makes modeling a large number of rides rather time-consuming;
- the correction factors need periodic revision in order to bring the norms closer to those actually achieved.
Scientists at Rostov State Transport University are conducting computational experiments with simulation models. For example, they have proposed a metaheuristic algorithm for optimizing driving modes using stochastic optimization ideas (in particular, multistart, Monte Carlo procedures and simulated annealing). This algorithm is implemented in MATLAB and solves the optimization problem for train driving modes for given initial data on the track section (track profile, speed limits, etc.), rolling stock (weight, length and train formation) and running schedule [12][13][14].
Under the guidance of Professor I. M. Kokurin at the Solomenko Institute of Transport Problems of the Russian Academy of Sciences, an automated system was developed for the data output on the running time. This system is supplemented by calculations of energy-saving driving modes for freight trains performed by the ERA traction calculation module (examination, calculations, analysis). The module was created by specialists of the Far Eastern State Transport University (Khabarovsk) [15][16][17][18]. The simulation system plots forecast graphs for each train, designing energy-efficient lines of freight trains in the intervals between passenger, suburban and other scheduled trains.
European automated systems for monitoring energy consumption by electric rolling stock ERESS and TEMA can significantly increase the efficiency of obtaining information on energy consumption by rolling stock along the entire route. They also allow identifying reserves for reducing energy consumption based on more detailed information, and eliminate the human factor when reporting. However, these systems are manufactured in accordance with European Union standards and their primary purpose is billing the private business operators in the various tariff areas [19][20][21][22][23]. In this regard, these systems cannot fully function under the operating conditions of Russian Railways and require substantial revision.
A computerized approach to controlling the energy consumption of electric rolling stock is known [24]. In this approach, data on the current values of energy consumption by the vehicle are received and stored in the memory of a storage device. These data are obtained at a certain time interval from the device that measures electric energy on the vehicle. Then, based on the measured data, the energy consumption on a separate section of the path is determined. The position of the vehicle is determined by means of GPS, Galileo, the passenger information system, manual entry by the driver, or leading marks. Then, comparative data are extracted from the data bank, which indicate the energy consumption by the vehicle on previous trips over this track section. The result of the comparison, depending on its sign, characterizes either the level of energy saving during the ride or the energy loss in comparison with previous rides. This method is used for assessing the efficiency with which various drivers drive a vehicle over a section. However, it cannot be used for determining the energy consumption of technological operations (demurrage, making up schedule time, temporary speed limits, non-scheduled stops, stops at prohibiting traffic signals), since the boundaries of the sections covering braking, demurrage, starting and acceleration are not specified and not known in advance.
In Italy, scientists developed an algorithm for a mixed integer linear problem of train motion modes (TDRC-MILP algorithm). It is used to minimize energy consumption by electric rolling stock in real time when the route and schedule of trains are automatically set. However, this algorithm was tested only on a simulation model without taking into account the actual operating conditions, where the slope, curvature and speed limits vary from one track section to another [25,26].
In Hungary, scientists carried out research based on the statistical processing of data from the on-board train tracking systems MÁV on the driving mode's impact on energy consumption. As a result of the study, it was found that selecting the various train driving modes by various drivers in the same operating conditions significantly affects the energy consumption of electric rolling stock. In a continuation of the research topic, the authors propose using data from remote railway telemonitoring systems, which will allow developing methods for assessing, controlling or predicting the use of electricity, and, as a result, to increase the energy efficiency of electric rolling stock [27].
In Spain, scientists have developed a simulation model for calculating the speed profiles of Valencia metro vehicles to minimize energy consumption. The proposed model calculates various commands for the systematic execution by drivers. The resulting simulator was adjusted using on-board measurements of speed, acceleration and energy. However, it is worth noting that this simulation model is valid only for the electric rolling stock of the metro and is unsuitable for modeling the motion modes of heavy and long trains of trunk railways [28].
In China, scientists have developed a simulation model for energy-efficient multi-train control. Unlike well-known analytical methods, this approach avoids some excessive simplifications, such as a constant track profile, continuous control and single train simulation. The simulation results showed that energy-efficient control methods for several trains allow avoiding unnecessary braking and reducing energy consumption [29].
In Australia, scientists have developed the Energymiser® device, which is used on freight and passenger trains in Australia and the UK to provide drivers with tips on selecting energy-efficient train driving modes. Energymiser® uses a specialized numerical algorithm to find the optimal switching points for each steep section of the railway. For each sustained speed, a perturbation analysis was used to prove that the convexity of the local energy functional is sufficient to guarantee uniquely determined optimal switching points for each steep section of the track [30].
In a study from the Netherlands [31], the authors note that the Pontryagin maximum principle is intensively applied to derive the optimal driving modes that make up the optimal energy-efficient strategy for driving a train in various conditions. Nevertheless, the optimal sequence of driving modes and the points of switching between them are generally not trivial to find. This has led to a wide range of optimization models and algorithms for calculating optimal railway trajectories and, more recently, to their use for optimizing schedules with a balance between energy efficiency and ride duration.
It should be noted that currently the issue of organizing the control of electricity consumption by electric rolling stock is mainly solved by using analytical methods. They are based on the designing of simulation models, which, in turn, are usually based on the Pontryagin maximum principle. However, considering the development of systems for recording motion parameters on electric rolling stock, as well as other automated systems of Russian Railways, it seems promising to develop and study artificial intelligence methods and algorithms for solving real-time energy consumption monitoring of electric rolling stock. Currently, these issues are not studied enough, so conducting research on the stated topic should give an impulse to the development of this research area.

Object and methods of research
The object of research is modern 2ES6 DC electric locomotives equipped with motion parameters recorders that are part of the microprocessor motion control system (MPMCS). By decoding the data of the motion parameters recorders, the energy efficiency of the 2ES6 electric locomotives was estimated for a section with a flat track profile within the limits of arbitrary energy tracking areas. Correlation and regression analysis was used for the statistical assessment.

Results
Motion parameters recorders of the microprocessor locomotive control system (MPLCS MPR) are among the most advanced ones. They are produced by NPO SAUT and have the following significant drawbacks from the data analysis point of view:
- insufficient data and its low reliability;
- lack of recorded data reference to rides and locomotive crews;
- inability to select a constant interval for recording measurement results;
- high probability of errors when writing data to the cartridge;
- lack of GPS/GLONASS satellite navigation system;
- lack of wireless data transmission;
- software imperfection;
- inconvenience of exporting data from the cartridge file and its incompleteness.
MPLCS MPR does not record the values of energy consumed for own needs, for traction and return of energy to the contact network. The information from the electric meter is extremely unreliable and often changes to an arbitrary value for extremely short time intervals.
Data is recorded by the system continuously: rides are not separated in any way, and the train number is not recorded. This forces the user to look manually, on the motion speed schedule, for the sections of the record that correspond to rides. Even so, the boundaries of a ride cannot be determined exactly.
Because a constant recording interval cannot be selected, data are recorded at varying intervals, presumably whenever the state of the electrical circuits changes. Adjacent records can be separated by a time interval of several milliseconds or several minutes. For this reason, the data from the cartridges can be characterized as insufficiently complete and requiring additional processing.
Errors occurring during recording produce a large number of records that carry no useful information. Absurd parameter values and excessively rapid changes in the values force the recorded MPR information to be discarded as unreliable.
Due to the lack of a GPS/GLONASS satellite navigation system, the speed and the route coordinate are determined from the wheelset tire. Since the tire diameter can vary significantly during operation, the recorded values are not accurate enough and can differ significantly from the actual ones.
The lack of wireless data transfer makes it impossible to analyze the recorded information in real time and necessitates additional procedures for transferring it from cartridges to a personal computer, which creates a risk of damage or loss.
The inconvenience and low functionality of the decoding software for cartridge data make it impossible to analyze the data directly and create the need to export them from the cartridges for subsequent processing. For example, the recorded data are split into an arbitrary number of fragments by an unknown algorithm. Many of these fragments carry no useful information, which significantly complicates data analysis.
The data export procedure is time consuming. As a result of export, the MPLCS MPR program generates a text file containing many (tens of thousands) lines with information about the locomotive state. Without additional processing, analysis of the exported data is not possible.
Due to incompleteness and inaccuracy, the information received from MPLCS MPR does not allow the following:
- to determine the exact time of the beginning and end of the ride and its parameters;
- to identify the locomotive team that completed the ride;
- to accurately determine the coordinate of the route and speed;
- to determine the crossing time for the boundaries of areas;
- to determine the amount of energy consumed and returned to the network.
The above-mentioned disadvantages do not allow the energy efficiency of rides to be analyzed directly, which led to the need to develop a technology for processing data from motion parameters recorders.
In order to determine the values of electricity consumption and return within the limits of arbitrary tracking areas, one needs to have the following information:
- train and locomotive numbers;
- rail transport weight;
- axle load;
- ambient temperature;
- extent of areas;
- crossing time for the borders of areas;
- the amount of electricity consumed by traction;
- the amount of electricity consumed for own needs;
- the amount of electricity returned to the contact network.
From the MPLCS MPR cartridge files, the locomotive number can be determined directly, and the electric energy consumed for traction and for own needs, as well as the recovered energy, can be determined after conversion. The rest of the information cannot be obtained from the cartridge data alone.
Part of the required information was obtained from an automated system for maintaining and analyzing the train sheet (URAL-VNIIZHT TS). This system allows visually obtaining the following necessary information:
- train and locomotive numbers;
- time of the beginning and end of the ride;
- time of crossing the tracking areas;
- axle load;
- rail transport weight.
Comparing the information from the MPLCS MPR and the TS system, the MPLCS MPR cartridge file can be manually found, which contains the required ride, and the number of the fragment containing it can be determined. Such operation is rather laborious and cannot be automated or simplified due to imperfection of the software.
The temperature data was obtained from the annual report of a weather station located at a minimum distance from the studied railway section. In case of available MPLCS MPR data, data from the TS system, reports of weather stations and data on the length of tracking areas, it becomes possible to determine the consumption and return of energy within the boundaries of arbitrary tracking areas.
This process is extremely time-consuming, which created the need for its automation. It was implemented using procedures in the Visual Basic for Applications (VBA) language built into the Microsoft Excel spreadsheet processor. This environment was chosen because of the wide use of the Microsoft Office software package both in Russian Railways and in educational institutions, and because of the relative simplicity of the VBA language. In this case, no additional programs need to be installed for working with macros, provided that MS Excel is available.
Since the process cannot be fully automated due to the imperfection of the MPLCS MPR software and due to the need for collecting additional information, there is a need for manual data preparation for subsequent processing, namely:
- searching for the required fragment of the file from the MPLCS MPR cartridges based on information from the TS system and exporting it for each section;
- transferring information from the TS system to the Microsoft Excel table;
- transferring the information about the length of hauls to the Microsoft Excel table;
- receiving the weather station reports from open sources;
- transferring information from weather station reports to a Microsoft Excel table by using a special sub-program.
As a result of these procedures, an MS Excel workbook is formed containing all the necessary data about the studied area, which makes them accessible to the program and easy for a person to review.
The algorithm of the program in a simplified form is presented in Figure 1. Let us consider the main stages of the program in more detail. At the first stage, the program reads data from a manually created MS Excel workbook containing all the necessary information about the studied section: data on the train schedule for the entire period, temperature values for the entire period, names and lengths of tracking zones.
At the second stage, the program reads information from two text files with data exported from the MPLCS MPR program. One file corresponds to one locomotive section.
At the third stage, faulty records are corrected and the data are converted. Instead of a date, some records of exported data contain the meaningless placeholder "00.00.2000", which the program replaces with the correct date, determined from the nearest date and time. Also, the separate values of the recording date and time are converted into a single value of the "Date" type that holds both the date and the time, rounded to whole seconds. Excess information is discarded.
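The third stage can be illustrated with a minimal Python sketch (the program described here is written in VBA and its code is not given in the text). The record layout, the function name `repair_timestamps` and the date and time formats are assumptions made for illustration. "Nearest" is read here as the closest record in the file that carries a valid date, which is also an assumption.

```python
from datetime import datetime, timedelta

BAD_DATE = "00.00.2000"  # placeholder found in the exported files instead of a date

def repair_timestamps(records):
    """records: list of (date_str, time_str), e.g. ("14.03.2019", "08:15:02.640").

    Returns datetime values rounded to whole seconds. A record whose date is
    the placeholder borrows the date of the closest record (by position in the
    file) that has a valid date. This reading of "nearest" is an assumption.
    """
    valid = [i for i, (d, _) in enumerate(records) if d != BAD_DATE]
    if not valid:
        raise ValueError("no record carries a valid date")
    out = []
    for i, (d, t) in enumerate(records):
        if d == BAD_DATE:
            nearest = min(valid, key=lambda j: abs(j - i))
            d = records[nearest][0]
        # accept times with or without a fractional part (assumed formats)
        fmt = "%d.%m.%Y %H:%M:%S.%f" if "." in t else "%d.%m.%Y %H:%M:%S"
        stamp = datetime.strptime(f"{d} {t}", fmt)
        # round to the nearest whole second
        if stamp.microsecond >= 500_000:
            stamp += timedelta(seconds=1)
        out.append(stamp.replace(microsecond=0))
    return out
```

Merging the date and time into one value at this point keeps the later stages simple, since every subsequent comparison works on a single timestamp.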
At the fourth stage, each of the data arrays is assigned to the corresponding section of the locomotive.
At the fifth stage, the program uses the time limits of the MPLCS MPR records to find the corresponding ride in the workbook with information about the section. After that, the program reads all the necessary information: train number, rail transport weight, axle load, temperature on the ride day, and running schedule data.
At the sixth step, all entries made earlier than the beginning or later than the end of the ride are discarded. This allows significantly reducing the amount of processed data and speeding up the program.
At the seventh stage, the program converts the records using linear interpolation so that the time interval between adjacent records is one second:
- several records made in the same second are converted into one record with averaged parameter values;
- records with an interval of more than a second are supplemented, and the parameters are calculated using linear interpolation;
- records with 1 second intervals remain unchanged.
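The three rules of the seventh stage can be sketched in Python for a single recorded parameter. The function name `resample_to_seconds` and the representation of a record as a `(second, value)` pair are assumptions made for illustration.

```python
from collections import defaultdict

def resample_to_seconds(records):
    """records: iterable of (t, value) with t in whole seconds (int), any order.

    Returns a list of (t, value) with exactly one entry per second between
    the first and the last timestamp.
    """
    buckets = defaultdict(list)
    for t, v in records:
        buckets[t].append(v)
    # several records in the same second -> one record with the averaged value
    known = sorted((t, sum(vs) / len(vs)) for t, vs in buckets.items())
    out = []
    for (t0, v0), (t1, v1) in zip(known, known[1:]):
        out.append((t0, v0))
        # gap longer than one second -> fill it by linear interpolation
        for t in range(t0 + 1, t1):
            out.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
    if known:
        out.append(known[-1])
    return out
```

For example, the input `[(0, 10.0), (0, 20.0), (1, 30.0), (4, 60.0)]` yields one value per second from 0 to 4: the two records at second 0 are averaged to 15.0, and seconds 2 and 3 are interpolated between 30.0 and 60.0.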
At the eighth stage, redundant information is discarded to the nearest second. At the ninth stage, the energies of the sections and of the locomotive as a whole are calculated. The effect of this section of the program on the recorded parameters is illustrated in Figure 2.
The calculation of the section energies uses the following quantities: $I_{\mathrm{avg}}$, the average current of the first section; $U$, the overhead system voltage applied to the section; $I_{12}$, the current of the first and second engines; $I_{34}$, the current of the third and fourth engines; $k$, a coefficient that takes into account the type of connection of the traction engines, with $k = 1$ for positions up to and including 44 and $k = 2$ for positions over 44.
The energy consumed by the first electric locomotive section for own needs per second (kW·h) is determined from the current for own needs of the first section. The energy consumed by the second electric locomotive section for own needs per second (kW·h) is determined likewise from the current for own needs of the second section. The total energy consumption (kW·h) is then calculated. If the total energy consumption is above zero, the electric locomotive consumes energy. If it is below zero, the locomotive returns energy to the network. For convenience, the energy consumption is duplicated in two columns that hold the energy consumed for traction and the energy returned to the network separately. An example of the data obtained as a result of the work of this program section is presented in table 1.
At the tenth stage, the general data array is divided into areas according to the time of crossing their borders, determined according to the running schedule data.
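The formulas of the ninth stage are not reproduced in this text, so the following Python sketch only illustrates a calculation that is consistent with the quantities listed above. It assumes that the power of a section is the product of the overhead voltage and the section current, that the section current equals k times the mean of the two engine-pair currents, and that own needs are fed at the same overhead voltage. All function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600.0

def connection_factor(position):
    # from the text: k = 1 up to and including position 44, k = 2 above it
    return 1 if position <= 44 else 2

def section_traction_kwh(u_volts, i12_amps, i34_amps, position):
    """Traction energy of one section over one second, kW·h (assumed form)."""
    # assumption: section current = k * mean of the two engine-pair currents
    i_avg = connection_factor(position) * (i12_amps + i34_amps) / 2.0
    return u_volts * i_avg / 1000.0 / SECONDS_PER_HOUR

def own_needs_kwh(u_volts, i_own_amps):
    """Own-needs energy of one section over one second, kW·h (assumed form)."""
    return u_volts * i_own_amps / 1000.0 / SECONDS_PER_HOUR

def split_total(total_kwh):
    """Duplicate the signed total into (consumed, returned) columns."""
    return max(total_kwh, 0.0), max(-total_kwh, 0.0)
```

With these assumptions, a negative signed total (regenerative braking) ends up in the "returned" column as a positive number, which matches the sign convention described in the text.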
At the eleventh stage, the following set of values is determined: ride time, the total energy consumed by traction, the total energy returned to the network, the total energy consumed for own needs, specific power consumption, specific recovery, average speed according to MPLCS MPR, and average speed according to the TS system. The specific energy consumption (kW·h/ch) and the specific energy recovery (kW·h/ch) are expressed through the railway train weight $m_t$ and the section length $l$.
At the twelfth stage, the received data is drawn up in the form of a reporting table and exported to a separate MS Excel workbook, which contains or will contain reports on all rides. After completing this step, the part of the program for processing data from motion parameter recorders that is responsible for determining the energy consumption and return within the limits of arbitrary tracking areas finishes its work.
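The division into areas (tenth stage) and the summation of energies per area (part of the eleventh stage) can be sketched as follows. The sample layout, the dictionary keys and the function name `totals_by_area` are assumptions made for illustration. The sketch assumes one sample per second, as produced by the seventh stage, and covers only the splitting and summation, not the specific indicators.

```python
from bisect import bisect_right

def totals_by_area(samples, boundaries):
    """samples: list of (t, consumed_kwh, returned_kwh, own_kwh), one per second.
    boundaries: sorted crossing times [t0, t1, ..., tn]; area i covers
    t_i <= t < t_(i+1).

    Returns one dict per area with the ride time and the summed energies.
    """
    areas = [{"seconds": 0, "consumed": 0.0, "returned": 0.0, "own": 0.0}
             for _ in range(len(boundaries) - 1)]
    for t, consumed, returned, own in samples:
        i = bisect_right(boundaries, t) - 1
        if 0 <= i < len(areas):  # samples outside all areas are ignored
            area = areas[i]
            area["seconds"] += 1
            area["consumed"] += consumed
            area["returned"] += returned
            area["own"] += own
    return areas
```

A binary search over the sorted crossing times keeps the assignment of each one-second sample to its area cheap even for rides with tens of thousands of records.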
In order to assess the energy efficiency of a ride, it is necessary to determine the consumption rate for each of the considered zones. The probabilistic-statistical method was used.
To determine the consumption rate for each ride, taking into account such factors as the railway train weight, axle load and ambient temperature, an equation of multiple nonlinear regression was formulated in terms of the railway train weight $m_t$, the axle load $q$ and the ambient temperature. The regression coefficients (indexed 0, 1 and 2) are determined according to the data from multiple trips.
Since the regression analysis procedure is complex and time-consuming, it was also automated using the data analysis package built into the MS Excel spreadsheet processor and an additional sub-program, the algorithm of which is presented in a simplified form in Figure 3. At the second stage, the values 1/q and 1/mc are calculated for each ride. At the third stage, the program distributes the data by tracking areas. As a result, a set of data arrays is formed, each containing information about all rides within only one tracking area.
At the fourth stage, the program performs regression analysis using the built-in data analysis package. As a result, the regression coefficients with indices 0, 1 and 2 are determined, as well as the normal specific energy consumption for each ride within each of the tracking areas.
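The regression equation itself is not reproduced in this text. The Python sketch below therefore assumes a model that is linear in the reciprocals calculated at the second stage, w = a0 + a1·(1/q) + a2·(1/m), where the coefficient names a0, a1, a2 and the function names are hypothetical. The fit uses ordinary least squares through the normal equations, which stands in here for the MS Excel data analysis package.

```python
def fit_linear(rows, y):
    """Ordinary least squares via the normal equations (X^T X) a = X^T y.

    rows: list of predictor lists (without the constant term); y: targets.
    Returns [a0, a1, ...], where a0 is the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    n = len(X[0])
    # augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(n)]
    # Gaussian elimination with partial pivoting
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    # back substitution
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (A[i][n] - sum(A[i][j] * a[j] for j in range(i + 1, n))) / A[i][i]
    return a

def fit_area(rides):
    """rides: list of (train_mass, axle_load, specific_consumption) for one area."""
    rows = [[1.0 / q, 1.0 / m] for m, q, _ in rides]
    return fit_linear(rows, [w for _, _, w in rides])

def normal_specific(coeffs, train_mass, axle_load):
    """Predicted (normal) specific consumption under the assumed model."""
    a0, a1, a2 = coeffs
    return a0 + a1 / axle_load + a2 / train_mass
```

One call to `fit_area` per tracking area reproduces the per-area structure described above; `normal_specific` then gives the normal specific consumption of an individual ride.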
At the fifth stage, the program calculates energy consumption rate for the area passage and the discrepancy coefficient for each ride within each of the tracking areas.
The normal value of consumed energy (kW·h) and the discrepancy coefficient (%) are then calculated; in the expression for the discrepancy coefficient, the min(·;·) term denotes the smaller of the two values. The discrepancy coefficient shows the relative deviation of the actual parameters from the predicted ones. Rides with a deviation of more than 30% should undergo a more detailed study at the depot, as they are evidence of either a malfunction of the motion parameters recorders or a significant waste of energy.
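Since the expression for the discrepancy coefficient is not reproduced in this text, the following Python sketch assumes one form that agrees with the description: the difference between the actual and the normal energy, divided by the smaller of the two values. The function names and the sign convention (positive for overconsumption) are assumptions.

```python
def discrepancy_percent(actual_kwh, normal_kwh):
    """Signed relative deviation in percent, normalised by the smaller value.

    Assumed form: (actual - normal) / min(actual, normal) * 100.
    """
    smaller = min(actual_kwh, normal_kwh)
    if smaller <= 0:
        raise ValueError("both energy values must be positive")
    return (actual_kwh - normal_kwh) / smaller * 100.0

def needs_review(actual_kwh, normal_kwh, threshold=30.0):
    # the text sends rides that deviate by more than 30 % for study at the depot
    return abs(discrepancy_percent(actual_kwh, normal_kwh)) > threshold
```

Dividing by the smaller value makes the coefficient symmetric in magnitude for over- and underconsumption, so a single 30% threshold can be applied in both directions.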
At the sixth stage, the final reporting table is formed, which contains the complete information about each ride within each of the tracking areas and is displayed on the corresponding sheet of the MS Excel workbook. One sheet corresponds to one tracking area.
After the sixth stage, the part of the program responsible for evaluating the energy efficiency of rides completes its work.
In order to test the algorithm for assessing energy efficiency, more than 200 rides were analyzed, made by electric locomotives of the 2ES6 series on a section of the West Siberian Railway with a length of 316 kilometers. Based on the energy consumption data obtained for each of the tracking areas, regression analysis was performed to determine the standards of specific and absolute energy consumption, as well as the discrepancy coefficient. The equations and correlation coefficients obtained as a result of regression analysis are presented in table 2.
As an example, let us consider a trip made by train No.2135 (locomotive No.225) with a mass of 7907 tons (axle load of 22.99 tons) along the route Barabinsk-Moskovka. The data obtained as a result of the algorithm are presented in Figure 4. As a result of the analysis, it was found that the actual energy consumption differs from the predicted by 7%, i.e. the ride was carried out with a slight excess of the standard. The largest energy waste was recorded in the area of Karachi-Chany lake. A detailed analysis of the files from the MPLCS MPR cartridges revealed that there were no non-scheduled stops and temporary speed restrictions in the area. Based on this, it can be assumed that the energy was wasted due to the driver choosing irrational driving modes for this area or due to the external factors impact, such as weather conditions, temporary deterioration of the track condition (oil film on the surface of the rail head, etc.). An exact determination of the reason for overspending is impossible due to the incompleteness of the information available for analysis.

Conclusions
The developed programs allow analyzing data from the motion parameters recorders in semiautomatic mode, in order to determine the consumption and return of energy within the limits of arbitrary tracking areas, as well as to determine the standards of specific and absolute energy consumption for each ride within these areas. The results of the analysis are displayed in the form of visual and informative reporting tables. It is also possible to generate detailed ride reports with second-by-second records of the main parameters for each section separately and for the locomotive as a whole.
A significant drawback is the need for preliminary collection and manual processing of part of the information, caused by the imperfection of the motion parameters recorders used on 2ES6 locomotives and of the software developed for them.
Based on the information received, it is possible to identify trips with low energy efficiency and determine the most unfavorable areas in terms of energy consumption.
However, for the full implementation of the proposed consumption control method, it will be necessary to create a comprehensive automated system that combines the entire set of measured parameters, both on electric rolling stock and in the traction power supply system.