The clique approach to identifying critical elements in gas transmission networks

We consider the gas transmission network operating on the territory of the Russian Federation. This network includes gas fields, gas consumers, nodal compressor stations and underground gas storages, which, depending on the given scenario of the system operation, can act as gas sources or gas consumers. The nodes are connected by gas pipelines. Because natural gas is used in heat and power engineering and in electricity generation, the gas transmission network may be exposed to terrorist threats, and the actions of intruders may be directed both at gas production facilities and at gas pipelines. To simulate intruders' attacks, a model of the attacker-defender type is proposed. In this model, the defender, represented by the system operator, solves the problem of finding the maximum flow to meet the needs of gas consumers. The attacker, in turn, attempts to minimize the maximum flow in the gas transmission network by excluding either nodes or gas pipelines. Gas transmission networks in Russia and Europe are very extensive and ramified, and have many bridges and reserve gas pipelines. Therefore, to inflict maximum damage on the system, attacks on cliques, that is, on several interconnected objects, are modelled. The article presents the results of test calculations in which we identify the most significant combinations of objects in the gas transmission network in terms of the potential threat from terrorist attacks.


Introduction
Ensuring an uninterrupted supply of fuel and energy resources to consumers and safeguarding overall energy security is a matter of high state concern [1]. Large-scale energy accidents caused by the breakdown of critical energy network facilities inflict considerable damage on consumers through severe shortages of end-use energy. Therefore, searching for and defining critical elements and their combinations in energy networks is a major current challenge.
The above is confirmed by research on energy networks currently conducted all over the world. The studies [2,3] focus on various issues concerning the modeling of energy networks as critical infrastructures. Researchers also investigate the vulnerability of critical energy infrastructures to acts of terrorism and risk analysis methods for interdependent critical infrastructures in extreme weather conditions [4,5]. Probabilistic risk assessment methods are adopted when an infrastructure vulnerability assessment is impossible for want of sufficient information [6-9]. If relevant data are available, statistical theory is applied to analyze and forecast the impacts of natural disasters on infrastructure performance [10]. Network approaches, such as complex network theory, are used to consider infrastructure topology when analyzing its structural vulnerability [11]. The latest international research places increased emphasis on interconnected infrastructures [12] and the effects of interactions between them on their vulnerability [13,14].
Natural gas is the dominant boiler and furnace fuel in the Russian energy sector, and it accounts for about 75% of the fuel consumed by Russia's electric power industry. If disruptions occur in the gas sector during periods of peak gas consumption, the amount of electrical power supplied to consumers may decrease by 50% to 60% in certain areas. As of now, about 90% of Russian natural gas is extracted in one gas producing region, i.e. in the north of Tyumen Oblast, located 2,000-2,500 kilometers from the main gas consumption areas and 4,000-5,000 kilometers from gas-importing countries. Consequently, almost all Russian gas is transported over long distances through main gas pipelines having many mutual intersections and cross junctions; moreover, major gas pipelines often run at short distances from one another.
Current research suggests that consumers in the regions will face considerable shortages of end-use energy after accidents occurring at the important intersections of main gas pipelines [15]. In addition to main gas pipeline intersections, industrial and nodal compressor stations as well as the linear sections of main gas corridors can also be potentially dangerous to the proper functioning of the network. Work is being done to single out, from all the potentially dangerous facilities listed above, the most significant ones and their combinations, and then to develop measures that enhance the operational reliability of these facilities with a view to maintaining uninterrupted fuel supply to consumers.
A number of studies have been undertaken to identify critical facilities in gas transmission networks. A list was compiled of the main gas pipeline intersections within the Unified Gas Supply System of Russia whose disruption will lead to a system-wide shortage of daily gas supply equal to or exceeding 5% [15]. Research was conducted to search for and identify combinations of various sections of main gas pipelines where a simultaneous disruption may result in a considerable shortage of daily gas supply (5% and more) across the entire network [16,17]. Based on practical experience and analysis of current international research [15,18], a methodology was elaborated, using the Russian gas sector as an example, to draw up lists of critical facilities in energy networks in order to ensure the latter's efficiency.
The above studies used the trial-and-test method to identify critical facilities and their combinations. Multiple iterative tests were carried out during which all elements and pairs of elements were turned off, one by one, in the network under investigation. As a result, the network elements and their combinations were identified which, if disrupted, would lead to the greatest shortage of gas within the network. Cases of simultaneous shutdown of three or more network elements were not taken into consideration due to computational limitations. Nonetheless, such cascade emergencies are possible, given the operational specifics of the gas transmission network. This is why, to take a comprehensive and detailed account of various factors in researching critical facilities, this study uses the maximum clique problem, aimed at detecting the most interconnected sections of the gas transmission network which, if disrupted, could inflict maximum damage on the network in terms of reduced gas supply to consumers. Another justification for using the maximum clique problem is the wide geographical distribution of facilities in the Russian gas transmission system. Intruders may have difficulties coordinating terrorist attacks across the wide territory of the Russian Federation. Therefore, a more reasonable approach to planning attacks on gas transmission system facilities may be to consider objects that are located relatively close to each other, which fits the clique approach to threat identification.
The clique problem is formulated as part of a methodology for modeling attacks on infrastructures. This methodology connects two sides, an attacker and a defender, into a single mathematical framework. Such models are based on a class of Stackelberg network interdiction games [19] in which two actors, a leader and a follower, have opposite interests. Within the given resource limitations, the leader aims to maximize the damage inflicted on the follower by reducing the network's transmission capacity, increasing transport costs at certain arcs or removing nodes or arcs from the network model. The follower solves his optimization problem (searching, for instance, for the maximum flow of a specific resource throughout the network) in the network modified by the leader [20]. Importantly, the mathematical description of such problems can be reduced to the attacker-defender, defender-attacker and defender-attacker-defender multilevel optimization models that have found widespread application in research on threat modelling and countermeasures at various critical infrastructures [21-27]. In a sense, such models identify the best possible protective action plan within a limited budget.
The present study focuses mostly on the safe operation of the Russian gas transmission network in terms of security against terrorism. To this end, the authors will examine an attacker-defender model in which the solution of the attacker's problem is based on maximum clique search. The defender's objective is to ensure the maximum gas flow through the network. This approach will identify the critical network nodes and combinations whose failure results in the largest decrease of the network's overall transmission capacity. The weighted maximum clique problem will also be considered to account for disparities between different nodes of the gas transmission network.
The paper is structured as follows. In Part 2, a brief overview will be provided of the above-mentioned defender-attacker, attacker-defender and defender-attacker-defender models. Part 3 presents a mathematical model of the Russian gas transmission network and a description of the problem of finding the maximum gas flow through the transmission network. Part 4 deals with maximum clique search. Finally, Part 5 articulates the attacker-defender model and identifies cliques using the Russian gas transmission network as an example. In the conclusion, the authors sum up the main findings of the study.

The Attacker-Defender Model
The attacker-defender model [21] is based on the optimization model of an infrastructure network with an objective function representing either its value or expenses from the defender's perspective.
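Let us consider that the defender operating the network minimizes expenses represented by the following linear function:

$$\min_{y \in Y} c\,y,$$

where $c$ is a vector of operating expenses and $y$ represents choices or actions related to network operation under the given constraint $y \in Y$. The attacker tries to maximize the optimal operational expenses by forbidding a set of steps. The attacker's choices, i.e. his plan of attack, are represented by the binary vector $x$, where $x_k = 1$ means that the network element $k$ is under attack. An attack on the network element $k$ forbids any set of actions directly dependent on this element. Reasonable constraints on the attacker's resources, combined with the binary type of the variable $x$, are expressed by the constraint $x \in X$, whereas $Y(x)$ is the set of the defender's feasible actions limited by the plan of attack $x$. As a result, the attacker solves the following type of problem:

$$\max_{x \in X} \min_{y \in Y(x)} c\,y, \qquad (1)$$

which is a bilevel optimization problem. Based on Stackelberg game theory, it rests on a number of key principles: the attacker and the defender act in a consistent manner; the attacker has at his disposal a complete model of the defender's optimal operation of the network even after an attack; and the attacker will manipulate the network to his greatest advantage.
The model (1) allows the attacker to perform a variety of actions. As an example, the aim of an attack may be to increase the defender's expenses rather than to limit his set of actions. The attacker can also reduce the value of network elements to the point of entirely excluding them.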
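The max-min structure of model (1) can be illustrated with a minimal brute-force sketch. Everything in it is an assumption made for illustration: the three routes, their costs, and the rule that the defender simply picks the cheapest surviving route stand in for the sets $X$ and $Y(x)$ and do not come from the study.

```python
from itertools import combinations

# Hypothetical toy data: operating cost c_j of each supply route the defender may use.
COSTS = {"route_a": 2.0, "route_b": 5.0, "route_c": 9.0}


def defender_cost(attacked):
    """Inner problem min c*y over Y(x): the cheapest route that survives the attack."""
    surviving = [cost for route, cost in COSTS.items() if route not in attacked]
    return min(surviving) if surviving else float("inf")


def best_attack(budget):
    """Outer problem: try every plan x in X (at most `budget` routes attacked)."""
    plans = [frozenset(p) for k in range(budget + 1) for p in combinations(COSTS, k)]
    return max(plans, key=defender_cost)


plan = best_attack(budget=1)
print(sorted(plan), defender_cost(plan))
```

With a budget of one attacked route, the sketch selects `route_a`, which raises the defender's optimal cost from 2.0 to 5.0. Full enumeration is only practical for very small sets of attack plans.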

The Defender-Attacker Model
In the attacker-defender model [21], the solution depends on numerous key network elements. The defender uses the available information to devise a protection plan with a view to minimizing the greatest damage on the attacker's part, which produces a complex three-level defender-attacker-defender model. However, in cases where the inner problem of network parameter optimization is absent or is solved in a trivial way, the result will be a bilevel defender-attacker model:

$$\min_{w \in W} \max_{x \in X(w)} g(w, x),$$

where $w$ is a vector of the defender's choices, $W$ is its feasible set, $X(w)$ is the set of attack plans available under the protection plan $w$, and $g$ is the objective function representative of the damage inflicted on the network. Defender-attacker models occur, for example, in territorial boundary patrol problems [21].

The Defender-Attacker-Defender Model
This section focuses on the mathematical framework of three-level defender-attacker-defender optimization models [21], which permit an exact solution under some natural assumptions, i.e. they provide the best possible set of protective measures for infrastructures. In planning protective measures, an important assumption is that the attacker's resources are reasonably limited; for instance, in the case of energy networks, such a limitation may be an upper bound on the number of facilities under attack. Generally, the defender-attacker-defender model looks as follows with account taken of (1):

$$\min_{w \in W} \max_{x \in X(w)} \min_{y \in Y(x)} c\,y. \qquad (2)$$

For the sake of simplicity, we assume that, if the network element $k$ is protected, i. [...] The resulting min-max-min problem (3) involves the matrix $D = \operatorname{diag}(d)$, where $d$ is a vector with the highest values of the components of $y$ permitted by the network. A number of specificities to be considered when solving the problem (3) are presented in [21]. The authors highlight the possibility of solving this three-level optimization problem in the same way as the defender-attacker problem, using Benders decomposition. In doing so, various techniques have been adopted within the decomposition algorithm to find a solution to the problem (3) [28].
Generally, the problem (2) is difficult to solve, given that it often cannot be reduced to a mixed-integer programming problem, so that complex decomposition techniques have to be applied [21].

Identifying the Maximum Gas Flow in the Russian Transmission Network
Russia's vast gas transmission network is made up of 388 nodes, including the following: 96 consumers, 33 producers, 29 underground gas storage facilities and 230 nodal compressor stations. Nodes with underground gas storage facilities may be used within the network as both gas consumers (if gas storage is required) and producers (if accumulated gas is needed to meet consumer demand). Internodal connection is ensured by 755 gas pipelines.
Network operators have access to the following information about the gas transmission network specifications: gas production volume at production nodes, amount demanded at consumption nodes, capacity of underground gas storage facilities and the transmission capacity of gas pipelines.
The task facing network operators is to determine whether the gas transmission network can provide consumers with the required amount of gas under the given network specifications, the transmission capacity of pipelines and the existing production volume. For this purpose, a maximum flow problem is set up [29], in which the following symbols are used: $n$ is the total number of nodes in the model; $m$ is the total number of arcs in the model; $I = \{1, 2, \dots, n\}$ is the set of numbers corresponding to the model's nodes; $I_p \subset I$ is the set of numbers corresponding to producers' nodes; $I_c \subset I$ is the set of numbers corresponding to consumers' nodes; $I_0 \subset I$ is the set of numbers corresponding to branch nodes; $f$ is the value of the total network flow; $U$ is the symmetric $n \times n$ adjacency matrix with elements

$$u_{ij} = \begin{cases} 1, & \text{node } i \text{ is adjacent to node } j, \\ 0, & \text{node } i \text{ is not adjacent to node } j; \end{cases}$$

$x_{ij}$ is the flow rate outgoing from node $i$ and incoming to node $j$; the corresponding arc is designated as $(i, j)$.
The maximum flow problem has the following interpretation: the aim is to find the greatest possible amount of gas that can be transmitted throughout the network under the given specifications of internodal links, accounting for the lines' established transmission capacity, the available production volumes and the given consumption volumes. Finding a solution to the maximum flow problem determines whether the network can provide consumers with the required gas volume delivered to them along gas transmission lines. Such a problem statement does not consider the gas consumed for the gas transmission network's in-house needs. To take account of it, this study increased the systemwide gas flow rate by 10%, this figure being based on numerous previous technical and economic research studies on the operation of Russia's gas transmission network [30].
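The following additional operations are to be performed to ensure the proper functioning of the maximum flow algorithm. Two fictitious nodes are added to the set $I$: a common source, denoted here by $s$ and linked to the producers' nodes, and a common sink, denoted here by $t$ and linked to the consumers' nodes. The problem statement for calculating maximum flow will look as follows:

$$\max f, \qquad \sum_{j} x_{ij} - \sum_{j} x_{ji} = \begin{cases} f, & i = s, \\ 0, & i \notin \{s, t\}, \\ -f, & i = t, \end{cases} \qquad 0 \le x_{ij} \le d_{ij}, \qquad (4)$$

where $d_{ij}$ stands for the transmission capacity of the arc $(i, j)$.
Once the problem (4) is solved, the network operator gains access to information on the gas transmission network's capacity to cover consumer load. Additionally, it becomes possible to detect the so-called weak points, i.e. fully loaded sections of the pipeline, as well as sections having substantial transmission reserves. These data can support modifications to the gas transmission network's specifications, namely increasing or decreasing the transmission capacity of particular gas lines with account taken of consumer load.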
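The maximum flow computation with a fictitious source and sink can be sketched as follows. This is a minimal illustration that assumes the Edmonds-Karp augmenting-path algorithm; the study itself used the AIMMS environment, and the node names, the labels `"s"` and `"t"`, and all capacities below are hypothetical.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest paths in the residual network."""
    residual = defaultdict(lambda: defaultdict(float))
    for (u, v), cap in capacity.items():
        residual[u][v] += cap
        residual[v][u] += 0.0  # make sure the reverse arc exists
    total = 0.0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # breadth-first search
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-9 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push


def with_fictitious_nodes(pipelines, production, demand):
    """Link a fictitious source 's' to producers and consumers to a fictitious sink 't'."""
    capacity = dict(pipelines)
    capacity.update({("s", p): volume for p, volume in production.items()})
    capacity.update({(c, "t"): volume for c, volume in demand.items()})
    return capacity


# Hypothetical toy network: two producers, one compressor station, two consumers.
pipelines = {("P1", "C"): 50.0, ("P2", "C"): 40.0, ("C", "K1"): 45.0, ("C", "K2"): 40.0}
production = {"P1": 60.0, "P2": 40.0}
demand = {"K1": 50.0, "K2": 30.0}

network = with_fictitious_nodes(pipelines, production, demand)
flow = max_flow(network, "s", "t")
print(flow, sum(demand.values()) - flow)
```

In this toy case the maximum flow is 75 against a total demand of 80, so the network cannot fully cover consumer load; the pipeline from `C` to `K1` is the saturated weak point.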

Setting Up the Maximum Clique Problem
We propose to define the most vulnerable targets for attacks within the Russian gas transmission network as combinations of interconnected nodes. In other words, the maximum clique problem is set up [31], which can be clarified with terms from graph theory.
A gas transmission network can be represented as a directed graph whose nodes are gas processing plants, gas consumers, underground storage facilities and compressor stations. Gas transmission pipelines are the edges of such a graph. The main objective of an offender is to cause maximum damage to the gas transmission network, that is, to reduce the maximum gas flow through the pipelines by attacking and rendering inoperative the network's major facilities. Needless to say, in planning an attack and searching for the clique, the fictitious nodes are left out of consideration. Let us assume that the attacker is solving the maximum clique problem in order to deactivate the largest number of interconnected facilities of the gas transmission network. The list below contains the main symbols for describing the maximum clique problem.

$G = (V, E)$ is an arbitrary undirected and weighted graph; $V = \{1, 2, \dots, n\}$ is the set of the nodes of the graph $G$; $E \subseteq V \times V$ is the set of the edges of the graph $G$; $w = (w_1, w_2, \dots, w_n)$ is the vector of node weights.
The following problem has to be solved to find a maximum clique:

$$\max \sum_{i=1}^{n} w_i y_i, \qquad y_i + y_j \le 1 \ \ \forall (i, j) \in \bar{E}, \qquad y_i \in \{0, 1\}, \ i = 1, \dots, n, \qquad (5)$$

where $y_i = 1$ if node $i$ belongs to the clique and $\bar{E}$ is the set of pairs of distinct nodes not joined by an edge of $G$. The solution of the problem (5) when $w_i = 1$, $i = 1, \dots, n$, determines the set of facilities in the gas transmission network forming a maximum clique. If multiple solutions are possible, the result will be a set of such cliques, which may substantiate further detailed planning of an attack, the attacker's objective being to cause maximum damage to the target. When specific weight values $w_i$, $i = 1, \dots, n$, are set in the problem (5), the solutions found can have interesting interpretations. One solution of the weighted maximum clique problem will be presented in Part 5. The solution of the problem (5) allows the attacker to identify the cliques linked to gas production, consumption and storage nodes.
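The clique search can be sketched in a few lines. The sketch assumes non-negative node weights, so that some maximum-weight clique is also a maximal clique, and it swaps the integer programming formulation for the Bron-Kerbosch enumeration of maximal cliques; the graph and the weights are hypothetical.

```python
def maximal_cliques(adj):
    """Bron-Kerbosch without pivoting; yields every maximal clique as a frozenset."""
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield frozenset(clique)
            return
        for v in list(candidates):
            yield from expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}
    yield from expand(set(), set(adj), set())


def max_weight_clique(adj, weights):
    """Pick the maximal clique with the largest total node weight."""
    return max(maximal_cliques(adj), key=lambda c: sum(weights[v] for v in c))


def adjacency(edges):
    """Build a symmetric adjacency map from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


# Hypothetical toy graph: a triangle 1-2-3 with a tail 3-4-5.
adj = adjacency([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])
unit = {v: 1 for v in adj}                 # w_i = 1: plain maximum clique
heavy_tail = {1: 1, 2: 1, 3: 1, 4: 10, 5: 10}

print(sorted(max_weight_clique(adj, unit)))
print(sorted(max_weight_clique(adj, heavy_tail)))
```

With unit weights the triangle `{1, 2, 3}` is returned, whereas the hypothetical weights shift the answer to the pair `{4, 5}`, which shows how the choice of weights changes which facilities are singled out.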

Clique Identification in the Russian Gas Transmission Network
Considering (4) and (5), the attacker-defender problem for the Russian gas transmission network can be presented as a bilevel problem (6) of the min-max type: the attacker minimizes, over the cliques obtained from (5), the maximum flow that the defender can achieve by solving (4) in the attacked network. Here $y$ is a $\{0,1\}$-vector specifying the nodes to be attacked (the attacker's plan), and the defender's maximum flow problem is subject to additional constraints that define the edges coming in or out of the nodes under attack and removed along with them from the network (the attack's consequences). According to the problem statement and based on the solution of the problem (5), the attacker selects a clique which, if removed from the network, causes maximum damage to the transmission capacity of the gas network. Nodes belonging to the clique are removed from the network along with all the adjacent edges. As a result, an attack on a node, first, renders inactive the facilities of the gas transmission network located in this node and, second, makes gas transit through this node impossible.
Let us now present maximum clique search and the damage caused to the network for different types of clique problems, the weighted and the unweighted ones. The model of the Russian gas transmission network was described by means of the AIMMS modeling environment [32], also used to solve the maximum flow and maximum clique problems. The calculations were made on a personal computer equipped with an 8-core AMD FX-8350 processor (each core with a clock speed of 4 GHz) and 8 GB of RAM.
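The attack evaluation loop described above (remove the nodes of a clique with their adjacent edges, then recompute the maximum flow) can be sketched as follows. This is an illustration under stated assumptions rather than the AIMMS model used in the study: the network data, the fictitious node labels `"s"` and `"t"`, and the restriction to 3-node cliques found by plain enumeration are all hypothetical choices.

```python
from collections import defaultdict, deque
from itertools import combinations


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict {(u, v): capacity}."""
    residual = defaultdict(lambda: defaultdict(float))
    for (u, v), cap in capacity.items():
        residual[u][v] += cap
        residual[v][u] += 0.0
    total = 0.0
    while True:
        parent, queue = {source: None}, deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-9 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push


def triangles(arcs):
    """All 3-node cliques of the undirected graph underlying the pipelines."""
    adj = defaultdict(set)
    for u, v in arcs:
        adj[u].add(v)
        adj[v].add(u)
    return [c for c in combinations(sorted(adj), 3)
            if all(b in adj[a] for a, b in combinations(c, 2))]


def rank_attacks(pipelines, production, demand):
    """Return the no-attack flow and the cliques sorted by the flow left after removal."""
    def flow_without(removed):
        cap = {(u, v): c for (u, v), c in pipelines.items()
               if u not in removed and v not in removed}
        cap.update({("s", p): q for p, q in production.items() if p not in removed})
        cap.update({(k, "t"): q for k, q in demand.items() if k not in removed})
        return max_flow(cap, "s", "t")
    base = flow_without(set())
    results = [(clique, flow_without(set(clique))) for clique in triangles(pipelines)]
    return base, sorted(results, key=lambda item: item[1])


# Hypothetical toy network with two producers, three compressor stations, two consumers.
pipelines = {("P1", "A"): 80.0, ("A", "B"): 60.0, ("A", "C"): 40.0, ("B", "C"): 20.0,
             ("B", "K1"): 70.0, ("C", "K1"): 30.0, ("C", "K2"): 50.0,
             ("P2", "C"): 40.0, ("P2", "K2"): 40.0}
base, ranking = rank_attacks(pipelines, {"P1": 80.0, "P2": 40.0}, {"K1": 70.0, "K2": 50.0})
for clique, flow in ranking:
    print(clique, flow, round(100.0 * (base - flow) / base, 1))
```

Because every clique is evaluated independently, the loop is easy to parallelize, but its cost grows with the number of cliques, each requiring one maximum flow computation.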

Gas Network Analysis. Finding Maximum Cliques
The graph of Russia's gas transmission network was analyzed as follows. The problem (5) for this network was solved separately, resulting in the identification of forty-five cliques (size 3) of potential interest to offenders planning and launching an attack on the gas network. The established cliques constitute the set $F(y)$. Given that these cliques are small-sized, the authors decided not to place budget limitations on the attacker. The value of the maximum gas flow throughout the network, subject to the presence of all nodes in the model, was calculated and amounted to 2235 million m³/day. This total is set at 100%. The maximum flow problem (6) was then solved with each clique excluded in turn. Table 1 shows data on the maximum flow with account of the excluded clique. Given that the number of cliques found is substantial, the cliques are grouped in terms of the damage caused to the gas transmission network. The Excluded cliques column specifies (in brackets) the number of excluded cliques, the damage resulting from which is, individually, within the range indicated in the Maximum flow column. The % column shows the damage expressed as a percentage of the maximum flow in contrast to the option with no attack.

Table 1. Maximum flow with account of the excluded clique
Excluded cliques | Maximum flow, million m³/day | %
... | ... | ...
3-node clique (14) | 2161-2221 | 3-1
3-node clique (13) | 2235 | 0

As can be seen, the steps taken by the attacker in relation to a number of node combinations do not create any difficulties in providing consumers with the necessary amount of gas: thirteen cliques of this kind have been detected. However, six 3-node cliques (gas network facilities physically interconnected by gas pipelines) were identified, the disruption of which will lead to a systemwide shortage of gas ranging from 11% to 39% of overall consumption. The importance of these combinations is confirmed by the fact that a 39-percent shortage of gas within the network is higher than the shortage resulting from the disruption of the most important critical facilities and their combinations identified in [18].
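The percentage column can be recomputed from the flow values quoted above. The short sketch below uses only the figures reported in the text (2235, 2161 and 2221 million m³/day); the function name is a hypothetical helper.

```python
BASELINE_FLOW = 2235.0  # million m^3/day, the no-attack value reported in the text


def damage_percent(flow_after_attack, baseline=BASELINE_FLOW):
    """Share of the baseline maximum flow lost after an attack, in percent."""
    return 100.0 * (baseline - flow_after_attack) / baseline


for flow in (2161.0, 2221.0, 2235.0):
    print(flow, round(damage_percent(flow), 1))
```

The two ends of the 2161-2221 range correspond to losses of about 3.3% and 0.6%, which agrees with the rounded 3-1% range in the table.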

Gas Network Analysis. Finding Weighted Maximum Cliques
Depending on modelling objectives, a specific weight can be assigned to each node in the graph. When searching for potentially vulnerable cliques, it seems logical to set the volume of gas produced as the weight of the network's nodes. In this case, the reason for excluding maximum cliques is to sabotage the facilities producing the network's highest volume of gas. Table 2 summarizes the results of the calculations. The Clique weight column shows the total volume of gas produced by all the nodes included in the clique. The clique identification approach in terms of the maximum volume of gas produced helped detect the production facilities within the gas transmission network interconnected by gas pipelines. Interestingly, the most efficient gas processing plant with no connection to other gas producing facilities ranked first in significance. Ensuring its protection will avoid a possible 13.5-percent decrease in the maximum flow. In the problem examined above, there are four maximum cliques, of size 2. The remaining of the twelve identified cliques are of size 1, which points, in a sense, to gas producing plants' isolation from one another. In this case, due to the small size of the cliques, the authors decided not to place budget limitations on the attacker. The presented analysis of the gas transmission network will enable the defender to implement, on a priority basis, a range of defensive measures with respect to the identified cliques. These measures will concern only the nodes in which gas processing plants are located.
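How cliques of sizes 1 and 2 arise among producers can be illustrated with a small sketch. It assumes that cliques are sought in the subgraph induced by the producer nodes and weighted by production volume, which is one reading of the procedure described above; the node names and volumes are hypothetical, and the brute-force enumeration is only suitable for a small number of producers.

```python
from itertools import combinations


def producer_cliques(production, pipelines):
    """Maximal cliques among producer nodes, heaviest total production first."""
    linked = {frozenset(pair) for pair in pipelines}
    producers = sorted(production)
    cliques = []
    # Larger groups are checked first, so a smaller group is maximal
    # only if it is not contained in an already accepted clique.
    for size in range(len(producers), 0, -1):
        for group in combinations(producers, size):
            is_clique = all(frozenset(pair) in linked for pair in combinations(group, 2))
            if is_clique and not any(set(group) < found for found in cliques):
                cliques.append(frozenset(group))
    return sorted(cliques, key=lambda c: -sum(production[node] for node in c))


# Hypothetical data: G2 and G3 are directly linked; G1 and G4 only reach a non-producer node N1.
production = {"G1": 300.0, "G2": 120.0, "G3": 100.0, "G4": 50.0}
pipelines = [("G2", "G3"), ("G1", "N1"), ("G4", "N1")]

for clique in producer_cliques(production, pipelines):
    print(sorted(clique), sum(production[node] for node in clique))
```

In this toy case the isolated producer `G1` ranks first with a weight of 300, ahead of the linked pair `G2`, `G3` with a combined weight of 220, mirroring the situation in which a single large plant outweighs any connected group.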

Conclusion
Detecting critical combinations of facilities in the gas transmission network makes it possible to plan defensive measures and to reduce, in case of an attack, potential damage to the network's facilities such as gas processing plants, underground gas storages and compressor stations. This study presents an approach to identifying critical combinations of facilities in the gas transmission network by solving the maximum clique problem. Importantly, this approach does not guarantee that excluding the identified node combinations yields the maximum damage that can be caused to the gas transmission network. Nonetheless, finding a solution to the maximum clique problem helps assess the significance of a given combination of the network's facilities with a view to preventing possible attacks on critical facilities. In doing so, it is possible to proceed to the analysis of larger node subsystems without using iterative procedures for step-by-step exclusion of network elements and their combinations, which complements the previous work done on this topic. Furthermore, solving these problems allows modifications to be introduced into plans for the long-term development of the gas transmission network in order to minimize the significance of the identified facilities and their combinations. In addition, the solution of the maximum weight clique problem has some major implications. The authors have identified node cliques with the highest volume of gas production. The identified node combinations are of considerable importance in terms of the gas transmission network's safe operation, and intensified safety monitoring of these facilities is a priority for the defense.
Notation for the attacker-defender model (1): $c$ is a vector of operating expenses and $y$ represents choices or actions related to network operation under the given constraint $y \in Y$. The attacker tries to maximize the optimal operational expenses by forbidding a set of steps represented by the vector $x$ of the attacker's choices, i.e. his plan of attack. If $x_k = 1$, the actions related to the network element $k$ under attack are forbidden. Therefore, the attack on the network element $k$ forbids any set of actions directly dependent on this network element. Some reasonable constraints on the attacker's resources combined with the binary type of the variable $x$ are represented by the constraint $x \in X$, whereas $Y(x)$ is the set of the defender's feasible actions limited by the plan of attack $x$.
Notation for the maximum flow problem: $I = \{1, 2, \dots, n\}$ is the set of numbers corresponding to the model's nodes; $I_p \subset I$ is the set of numbers corresponding to producers' nodes; $I_c \subset I$ is the set of numbers corresponding to consumers' nodes; $I_0 \subset I$ is the set of numbers corresponding to branch nodes.