Prediction model for the maintenance of rail infrastructure in Java

Maintenance is the longest phase in the life of a railway track once construction is complete and the track is in operation. The first indication that a track needs maintenance is its track quality index (TQI) value. Maintenance can be planned from the TQI data categories, which cover track superelevation, leveling, lining, and gauge width, with TQI ratings of very good, good, fair, and poor. At present only one track measurement train, the EM-120 owned by PT Kereta Api Indonesia (Persero), operates on the island of Java, so some railway tracks have not yet been measured and still lack a TQI value. Because TQI categorization is necessary for maintenance, a comprehensive study is needed to monitor and track how the research has advanced. This paper maps the literature on railway track maintenance, TQI, and prediction models. The literature database was taken from Google Scholar and analyzed with the VOSviewer tool to map previous research. The results are useful for understanding the current state of railway track maintenance research; however, no study has yet been identified that predicts the TQI category for railway tracks that have not been surveyed by track measurement trains.


Introduction
Maintenance is the longest phase after construction is completed and the line is put into operation. The first indication that a railroad line requires maintenance is its track quality value. Railway maintenance can be recommended on the basis of Track Quality Index (TQI) category data. Only one track measurement train operates on the island of Java, namely the EM-120 owned by PT Kereta Api Indonesia (Persero). The length of track measured by this train has decreased every year. In 2019, 66.90% of the railroad track was measured, while 33.10% was not [1]. To overcome this problem, TQI values need to be predicted for railroad lines that are not measured by the track measurement train, so that maintenance proposals can still be prepared.
Maintenance planning requires field data on the condition of the infrastructure to determine what maintenance should be carried out. The TQI is an early indication of damage or of a decline in the condition of the track geometry and sub-structure. On Indonesian railways, TQI data on track geometry are used as the basis for maintenance work and accident investigations [2].
Research on the operation and maintenance phase of railroads has produced the concept of sustainable geomechanical maintenance. Among this work is a study of rail wear caused by wheel friction by Berggren et al. [3]. Caetano & Teixeira modeled rail wear due to wheel friction in order to minimize maintenance costs. Ref. [4] developed a track geometry degradation model that considers prediction accuracy in supporting maintenance decisions; [5] measured the track quality index of the railroad body structure using ground penetrating radar; [6] modeled the prediction of track degradation from track geometry measurements to determine improvements to the railroad sub-structure; [7] developed an artificial neural assessment of railroad conditions to support decision making in maintenance and repair strategies; and [2] carried out a comparative study of track quality values calculated with the TQI formulas of several countries. Further research is still needed to predict the TQI maintenance category on railroad lines that are not measured by the EM-120 series track measurement train, covering the measurement of track height, alignment, twist, and gauge, as well as devices or clusters for crossings, switches, bridges, tangents, and curves.
In the research conducted from 1998 to 2021, gaps remain concerning the TQI in the railroad infrastructure maintenance category. This paper uses a science mapping approach to map the scientific boundaries and to bridge the problems addressed in this research. Science mapping aims to create a bibliometric map that is useful for understanding how the literature on a field of study has developed [8]. The science mapping in this paper uses bibliometric data from Google Scholar metadata analyzed with VOSviewer software. The paper analyzes and discusses the development of scientific publications, including the number, source, and type of publications, the number of citations, author contributions, and research subjects related to the research topic.

Research methods
This study reviews published research on rail infrastructure maintenance prediction models for railroads not measured by track measurement trains, using Scopus metadata from 1998 to 2021. The research uses science mapping methods and VOSviewer software. This article does not provide a detailed analysis of every available study. Instead, it quantitatively summarizes the state of the existing literature and the trends in its development, so that readers gain a systematic understanding of the growth of publications, the types of publication, the contribution of each country of origin, the sources, the growth in citations, the contributions of authors, and the fields of research and study.
The research proceeded in stages. It began by identifying existing research gaps in railroad infrastructure maintenance prediction models, and a gap was found in the literature discussing such models. Starting from that gap, the research objective was formulated: to conduct a science mapping of railroad infrastructure maintenance prediction models. The next step was to collect literature metadata from the selected bibliographic database (Scopus). The metadata were then analyzed with VOSviewer software and the findings discussed. Finally, conclusions were drawn from the findings.

Database selection
In this paper, bibliographic data in CSV format taken from the Scopus database were used. To support the review, the title, abstract, and keyword fields of the database were searched with the terms "model and maintenance", "railroad", and "quality". The paper also limits the research subject to engineering, computer science, materials science, and mathematics.
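The selection step described above can be sketched as a simple filter over an exported CSV file. This is only an illustrative sketch: the column names (`Title`, `Abstract`, `Author Keywords`), the sample rows, and the assumption that all search terms are combined with a logical AND are hypothetical and should be adapted to the actual export and query.

```python
import csv
import io

# Search terms taken from the text; combining them with AND is an assumption.
SEARCH_TERMS = ("model", "maintenance", "railroad", "quality")

def matches_query(record, terms=SEARCH_TERMS):
    """Return True if every term occurs in the title, abstract, or keywords."""
    # Column names are assumed; adjust them to the real export header.
    text = " ".join(
        (record.get(col) or "") for col in ("Title", "Abstract", "Author Keywords")
    ).lower()
    return all(term in text for term in terms)

def load_matching(csv_text):
    """Parse CSV text and keep only the records that satisfy the query."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if matches_query(row)]

# Tiny made-up sample standing in for a real export.
SAMPLE = (
    "Title,Abstract,Author Keywords,Year\n"
    "A railroad track quality model,Supports maintenance planning,track quality,2015\n"
    "Bridge fatigue analysis,Steel girders under load,fatigue,2012\n"
)

selected = load_matching(SAMPLE)
print([row["Title"] for row in selected])
```

In this sample only the first record contains all four terms, so it is the only one kept.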

Science mapping
With the science mapping method, a clear map of the development of a field can be drawn. Such a map changes and develops along with the field itself. Science mapping shows the structural and dynamic aspects of scientific research, and it is a spatial representation of how disciplines, fields, and authors are interrelated [9]. The established field of visual analytics offers a promising direction to pursue. Visual analytics can be seen as the second generation of information visualization [8].
The science mapping process has three conceptual steps to produce a science map [8]. These steps are selecting the units of analysis, which are the basic particles of the science under review; determining a measure of the relationship between units; and representing those relationships in a low-dimensional space (generally two dimensions). A growing number of science mapping applications can be used today, such as Harzing, VOSviewer, Open Knowledge Maps, Connected Papers, Answerthepublic, and Raxter. This research uses VOSviewer software to build scientific networks of publications, scientific journals, researchers, research organizations, countries, keywords, and terms [10]. VOSviewer can also create maps from network data and display them as network maps, overlay maps, and density maps. VOSviewer is considered highly capable for science mapping and is widely used with Scopus metadata.
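The first two conceptual steps (choosing units and measuring how strongly they are related) can be sketched in a few lines. The example below takes keywords as the units and normalizes each co-occurrence count by the product of the two keywords' occurrences, a ratio commonly called association strength. Whether this matches the exact normalization applied by the software is an assumption, the sample documents are made up, and the third step (laying the units out in two dimensions) is not shown.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(documents):
    """Count item occurrences and pairwise co-occurrences across documents."""
    occurrences = Counter()
    links = Counter()
    for items in documents:
        unique = sorted(set(items))          # count each item once per document
        occurrences.update(unique)
        links.update(combinations(unique, 2))
    return occurrences, links

def association_strength(occurrences, links):
    """Normalize each link by the product of the two items' occurrences."""
    return {
        pair: count / (occurrences[pair[0]] * occurrences[pair[1]])
        for pair, count in links.items()
    }

# Made-up keyword lists, one per document.
docs = [
    ["maintenance", "track geometry", "tamping"],
    ["maintenance", "track geometry"],
    ["maintenance", "reliability"],
]
occ, links = cooccurrence(docs)
strength = association_strength(occ, links)
for pair, value in sorted(strength.items()):
    print(pair, round(value, 3))
```

The normalization keeps very frequent items such as "maintenance" from dominating the map simply because they appear in most documents.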

Research development of rail infrastructure maintenance prediction model
Fig. 1 shows the development of publications on railroad infrastructure maintenance prediction models from the first publications in 1998 through 2021. The Scopus search identified 261 documents, comprising 69 open-access documents and 192 closed-access documents. Publications on rail infrastructure maintenance prediction models first appeared in 1998 and dealt with the computerization of rail vehicle maintenance systems [11] and the scheduling of rail track maintenance [12]. Between 1998 and 2003 the number of publications was still minimal, at fewer than 20 documents. During this period, researchers were still focusing on the initial challenges of railroad maintenance: maintenance schedules, maintenance costs, and facility maintenance. Over time, publications discussing the maintenance of railroad infrastructure based on the post-construction TQI gradually increased. The number of publications grew rapidly from 2005 to 2019, fell back in 2020, and began to rise slightly again in 2021. This shows that maintenance, the longest phase in the life of railroad infrastructure, attracts the attention of researchers.

Top 10 research outputs
Of the 261 documents found in the Scopus metadata from 1998 to 2021, journal publications and conference proceedings contributed the most (92.33%) and were the main drivers of this research. Based on Table 1, journals (157 publications) contributed 60.15%, conference proceedings (84 publications) contributed 32.18%, book series (15 publications) contributed 5.75%, trade journals (3 publications) contributed 1.15%, and books (2 publications) contributed 0.77%. This shows that journals and conference proceedings are the main channels through which researchers convey their thoughts, opinions, and studies on this topic.
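Each share quoted for a publication type is its document count divided by the 261 documents in the dataset. A minimal sketch of that arithmetic, using the counts stated in the text (the variable names are illustrative):

```python
# Document counts per publication type, as stated in the text.
counts = {
    "Journal": 157,
    "Conference proceeding": 84,
    "Book series": 15,
    "Trade journal": 3,
    "Book": 2,
}

total = sum(counts.values())

# Percentage share of each type, rounded to two decimals.
shares = {kind: round(100 * n / total, 2) for kind, n in counts.items()}

for kind, share in shares.items():
    print(f"{kind}: {counts[kind]} documents, {share:.2f}%")
```

Recomputing the shares this way is a quick check that the counts add up to the dataset size and that each percentage matches its count.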
Based on Scopus metadata, there are 73 publication sources from 1998 to 2021, and Fig. 2 shows those that contributed the most. The Journal of Rail and Rapid Transit is the outlet in which researchers most often publish their articles, while Transportation Research Record is the largest outlet for articles correlating with the maintenance of railroad infrastructure.

Influential countries
Based on the Scopus dataset, 43 countries contributed from 1998 to 2021. Productivity was mapped from the contributions of these countries using co-authorship analysis with full counting and a minimum of 5 publications per country. Of the 43 countries, 16 meet this requirement and form 6 clusters. Cluster 1 comprises Australia, China, Japan, and the United Kingdom; cluster 2 comprises Austria, Italy, and Switzerland; cluster 3 comprises Canada, France, and the United States; cluster 4 comprises Denmark and the Netherlands; cluster 5 comprises Portugal and Spain; and cluster 6 comprises Germany and Sweden. The cluster visualization is shown in Fig. 3, where the size of a node represents the size of a country's contribution. The number of scientific papers published is usually regarded as an indicator of the development of a discipline [13], and changes in that number can show how a science develops or changes [8].
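The country analysis above combines full counting with a minimum-publication threshold. The sketch below illustrates the idea on made-up data: under full counting, a co-authored paper counts as one whole document for every country involved, and each pair of co-authoring countries adds one to the link strength of both. The function names and sample affiliations are hypothetical, and the threshold is lowered to 2 only so that the tiny sample produces a result.

```python
from collections import Counter
from itertools import combinations

def full_counting(papers):
    """Return per-country document counts and total link strength."""
    documents = Counter()
    link_strength = Counter()
    for countries in papers:
        unique = sorted(set(countries))
        # Full counting: one whole document for every country on the paper.
        documents.update(unique)
        # Each co-authoring pair adds one to both countries' link strength.
        for a, b in combinations(unique, 2):
            link_strength[a] += 1
            link_strength[b] += 1
    return documents, link_strength

def meets_threshold(documents, min_documents):
    """List the countries with at least min_documents publications."""
    return sorted(c for c, n in documents.items() if n >= min_documents)

# Made-up affiliations, one list per paper; not taken from the dataset.
papers = [
    ["China", "United Kingdom"],
    ["China"],
    ["China", "Australia", "United Kingdom"],
    ["Portugal"],
]

documents, link_strength = full_counting(papers)
kept = meets_threshold(documents, min_documents=2)
print(documents["China"], link_strength["China"], kept)
```

Raising the threshold shrinks the set of countries that appear on the map, which is how 43 contributing countries reduce to the 16 that are clustered.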
Based on Fig. 3, the most-cited publications in each of the 5 countries with the largest contributions are as follows. In China, research on sustainability-based lifecycle management for railway turnout systems [14] has 97 citations, research on slab tracks under high-temperature conditions [15] has 64 citations, and a model and algorithm for integrating train timetabling and track maintenance task scheduling [16] has 58 citations. In the United Kingdom, the publication Perspectives on railway track geometry condition monitoring from in-service railway vehicles [17] has 144 citations, research on a fuzzy reasoning and fuzzy analytical hierarchy process-based approach to the process of railway risk information [18] has 108 citations, and the publication The effects of tamping on railway track geometry degradation [19] has 93 citations. In France, the publication Influence of the aerodynamic forces on the pantograph-catenary system for high-speed trains [20] has 131 citations, research on bivariate gamma wear processes for track geometry modeling, with application to intervention scheduling [21], has 55 citations, and the publication A decision support system for track maintenance [22] has 23 citations. In Portugal, research on integer programming to optimize tamping in railway tracks as preventive maintenance [23] has 70 citations. In Italy, research on optimizing maintenance strategies for railway track-beds considering probabilistic degradation models and different reliability levels [24] has 32 citations.
Table 2 ranks the 16 countries by publication contribution and gives their number of citations and total link strength. The top 10 contributing countries are China (74 publications), the United Kingdom (42 publications), France (19 publications), Portugal (19 publications), Italy (16 publications), Spain (16 publications), the United States (15 publications), Sweden (13 publications), the Netherlands (11 publications), and Australia (8 publications). China therefore dominates and contributes the most to railroad infrastructure maintenance research, as is evident from its number of publications, citations, and total link strength.
The overlay visualization in Fig. 4 shows the publication period by country contribution. It shows China as the pioneer of research publications on railroad maintenance.

Authors contributions
Each author's contribution was examined using co-authorship analysis, with the author as the unit of analysis and the full counting method. The thresholds were a minimum of 3 documents and a minimum of 3 citations per author.
The authorship mapping is shown in Fig. 5. Fig. 6 shows the overlay visualization of the authorship mapping: the darker the colour of a node, the older the authorship, and the lighter the colour, the more recent the authorship.

References with strongest citation
The number of citations of publications relevant to railroad infrastructure maintenance research continues to grow yearly. The entries of Table 3 are as follows.

Title | Year | Country | Citations
The effects of tamping on railway track geometry degradation [19] | 2013 | UK | 93
Data-driven optimization of railway maintenance for track geometry [25] | 2018 | United States | 81
Arching mechanism of the slab joints in CRTSII slab track under high temperature conditions [15] | 2019 | China | 64
Microscopic optimization model and algorithm for integrating train timetabling and track maintenance task scheduling [16] | 2019 | China | 58
Markov-based model for the prediction of railway track irregularities [26] | 2015 | China | 56
Optimal scheduling of track maintenance on a railway network [27] | 2013 | China | 50
Stochastic model for the geometrical rail track degradation process in the Portuguese railway Northern Line [28] | 2013 | Portugal | 49
RCM2 predictive maintenance of railway systems based on unobserved components models [29] | 2004 | UK | 46
Unobserved Component models applied to the assessment of wear in railway points: A case study [30] | 2007 | Spain | 45
Numerical modelling of high speed train/track system for the reduction of vibration levels and maintenance needs of railway tracks [31] | 2015 | Portugal | 38
Deep learning for track quality evaluation of high-speed railway based on vehicle-body vibration prediction [32] | 2019 | Sweden | 33
A track ballast maintenance and inspection model for a rail network [33] | 2013 | UK | 33
Optimization of maintenance strategies for railway track-bed considering probabilistic degradation models and different reliability levels [24] | 2021 | Italy | 32
Railway track quality assessment and related decision making [34] | 2004 | Netherlands | 30
Maintenance schedule optimisation for a railway power supply system [35] | 2013 | Australia | 28
On the use of second-order derivatives of track irregularity for assessing vertical track geometry quality [36] | 2012 | Sweden | 24
Track geometry big data analysis: A machine learning approach [37] | 2017 | United States | 23
A decision support system for track maintenance [22] | 2006 | France | 23
Predictive maintenance in dynamic systems: Advanced methods, decision support tools and real-world applications [38] | 2019 | France | 21
Exploring different alert limit strategies in the maintenance of railway track geometry [39] | 2016 | Portugal | 16

The relatively low citation counts indicate that railroad infrastructure maintenance remains an interesting topic to develop and research. The highest citation counts belong to titles that address this topic, including The effects of tamping on railway track geometry degradation [19] with 93 citations, Data-driven optimization of railway maintenance for track geometry [25] with 81 citations, Arching mechanism of the slab joints in CRTSII slab track under high temperature conditions [15] with 64 citations, Microscopic optimization model and algorithm for integrating train timetabling and track maintenance task scheduling [16] with 58 citations, and Markov-based model for the prediction of railway track irregularities [26] with 56 citations.

Main research subject areas
The research subject areas in this paper number ten: engineering; computer sciences; social sciences; material science; mathematics; earth and planetary science; environmental science; physics and astronomy; decision sciences; and business, management, and accounting. The number and percentage contribution of each field of study are shown in Table 4.
The fields of study that dominate publications on railroad infrastructure maintenance in the period 1998-2021 are engineering (253 documents), computer sciences (54 documents), social sciences (38 documents), material science (21 documents), mathematics (21 documents), earth and planetary science (14 documents), environmental science (13 documents), physics and astronomy (13 documents), decision sciences (12 documents), and business, management, and accounting (10 documents). Fig. 7 shows the proportional contribution of each field to the publications relevant to railroad infrastructure maintenance: engineering 54.2%, computer sciences 11.6%, social sciences 8.1%, material science 4.5%, mathematics 4.5%, earth and planetary science 3.0%, environmental science 2.8%, physics and astronomy 2.8%, decision sciences 2.6%, and business, management, and accounting 2.1%. This shows that publications are still concentrated in engineering, computer science, and mathematics, with little connection to other fields of science such as social science, material science, and others. Because a railroad infrastructure maintenance model draws on many disciplines, this may open up other research opportunities that use perspectives, scenarios, and methods from other scientific fields.

Main research area (keyword co-occurrence analysis)
An overview of the main research areas was derived from a mapping of all keywords [10]. This mapping found 928 keywords related to railroads, maintenance models, and quality. To examine the research more closely, the number of keywords was limited with the following parameters: co-occurrence analysis, full counting, all keywords as the unit of analysis [10], and a keyword occurrence threshold of 20. With these parameters, the resulting 20 keywords can be grouped into 6 clusters, as shown in Fig. 8. Cluster 1 is a collection of 7 items: asset management, machine learning, railway track, tamping, track degradation, track geometry, and track geometry degradation. Cluster 2 is a collection of 4 items: ballast, railway maintenance, track irregularity, and track quality index. Cluster 3 is a collection of 2 items: condition monitoring and reliability. Cluster 4 is a collection of 2 items: maintenance and railways. Cluster 5 is a collection of 2 items: genetic algorithms and railways. Cluster 6 has 1 item: degradation.
In this study, the main research areas are divided into 3 groups according to the keyword occurrences in Table 5. Group 1 contains frequently used keywords, with an occurrence of more than 10 (x > 10). Group 2 is the medium category, with a keyword occurrence between 6 and 10 (6 ≤ x ≤ 10). Group 3 contains rarely used keywords, with an occurrence below 6 (x < 6). Group 1 consists of 5 keywords: maintenance, reliability, track geometry, railway, and tamping. Group 2 consists of 6 keywords: condition monitoring, railway track, railway maintenance, track irregularity, track degradation, and track quality index. Group 3 consists of 5 keywords: asset management, machine learning, track geometry degradation, ballast, and degradation.
In terms of authors with the highest number of contributions, the top 3 authors with the most publications are Andrade, A.R. (5 documents), Andrews, J. (5 documents), and Kumar, U. (5 documents). Based on the number of citations, the 3 authors with the most citations are Andrade, A.R. (165 citations), Teixeira, P.F. (138 citations), and Liu, R. (119 citations). The main areas of study were dominated by research in engineering (54.2%), computer science (11.6%), and social science (8.1%).
The main research areas were identified from keywords over the period 1998-2021. From 1998 to 2015, the main research areas were planning, railroad construction, train operation, timetables, and train travel charts. From 2015 to 2016, the main research areas were related to railroad monitoring, asset management, and track degradation. From 2016 to 2018, the main research areas were railway maintenance, track geometry, genetic algorithm, machine learning, and track irregularity.
The main research areas were also reviewed by keyword occurrence, which gave three groups: group 1 (frequent), consisting of maintenance, reliability, track geometry, railway, and tamping; group 2 (moderate), consisting of condition monitoring, railway track, railway maintenance, track irregularity, track degradation, and track quality index; and group 3 (infrequent), consisting of asset management, machine learning, track geometry degradation, ballast, and degradation. The research areas that have progressed significantly are railroad infrastructure maintenance, track geometry, machine learning, and degradation.
This research proposes a holistic view of the development of science relevant to prediction models for railroad infrastructure maintenance, to ensure that railroads not measured by track measurement trains also receive maintenance. The results are expected to increase knowledge and to help future researchers identify and address the gaps that further research must fill to complement the existing literature. Although testing and analysis have been conducted, this research has some limitations: the depth of discussion needs to be improved, and the only source used is Scopus data. The results are useful for understanding the current development of rail infrastructure maintenance research. At the time of writing, no publication has been identified that builds a prediction model for the track quality index category on railroad tracks that are not measured by track measurement trains.
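The three occurrence groups can be written as a simple rule using the thresholds stated in the keyword analysis (more than 10, 6 to 10, fewer than 6). The sketch below is illustrative only: treating both 6 and 10 as belonging to the moderate group is an assumption, and the occurrence counts in the sample are made up rather than taken from Table 5.

```python
def occurrence_group(count):
    """Assign a keyword to a group from its occurrence count."""
    if count > 10:
        return "group 1 (frequent)"
    if count >= 6:              # assumed: 6 and 10 both fall in the moderate group
        return "group 2 (moderate)"
    return "group 3 (infrequent)"

# Hypothetical occurrence counts, for illustration only.
sample = {"maintenance": 14, "track degradation": 8, "ballast": 4}

groups = {keyword: occurrence_group(n) for keyword, n in sample.items()}
for keyword, group in groups.items():
    print(f"{keyword}: {group}")
```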
As shown in Fig. 2, the 10 sources with the highest contributions are the Journal of Rail and Rapid Transit (22 documents), Transportation Research Record (8 documents), Structure and Infrastructure Engineering (7 documents), Tiedao Xuebao/Journal of the China Railway Society (7 documents), Civil-Comp Proceedings (6 documents), Journal of Railway Engineering Society (5 documents), Reliability Engineering and System Safety (5 documents), Vehicle System Dynamics (5 documents), Construction and Building Materials (4 documents), and Journal of Transportation Engineering (4 documents).
Table 3 presents the 20 documents with the highest number of citations in the dataset from 1998 to 2021. The countries of origin of these 20 publications are China (4 documents), the United Kingdom (3 documents), Portugal (3 documents), France, Sweden, and the United States (2 documents each), and Italy, Spain, the Netherlands, and Australia (1 document each).

Fig. 8. Keyword network visualization.
The development of keywords over the research period 1998-2021 is shown in Fig. 9. In the early period, 1998-2015, the main research was carried out on planning, railroad construction, train operation, timetables, and train travel charts. In the middle period, 2015-2016, the main research areas were related to railroad monitoring, asset management, and track degradation.

Table 1. Publication type.

Table 2. Country rankings by number of publications.

Table 3. Top 20 references with strongest citation burst.

Table 4. Subject of research areas.