Network fault diagnosis and rectification suggestions for a 300MW unit under the I/A Series system

With the progress of science and technology, distributed control systems (DCS) have been widely used in power plant automation. A DCS integrates modern computer, control, communication and CRT display technology to monitor and control the power production process. The DCS network uses 10Mbps Ethernet as the data transmission channel of the DCS, and its stability bears directly on the safe operation of the whole power plant. More and more power generation enterprises therefore attach importance to the security management of the DCS network. This paper analyzes a network failure in a Foxboro I/A Series system and proposes a solution, in the hope that it will serve as a reference for the safety management of similar DCS networks on other units.


Introduction
The I/A Series system is an open computer control system developed by Foxboro. It uses computer, communication, automatic control and display technology to provide centralized monitoring, operation and management of the production process together with decentralized control. Compared with a process control system built from conventional analog instruments, it is far less complicated, greatly reduces the workload of process operation, and improves both the efficiency of process monitoring and the stability of the system. At the same time, control processors (CP) detect and control process parameters and signals in real time. This realizes decentralized control, greatly reduces the dependence of the individual control loops on the monitoring computer, and avoids the risk of centralized control. In some early products, however, the network configuration is not well developed, which leaves the DCS with security risks.

Fault description
During operation of a unit at company A, the parameters on the DCS displays suddenly changed to "*" and the equipment status indications became abnormal. The history station, operator stations and engineer station went offline, so the operating status of the unit could not be monitored.

3.1. DCS network topology
The DCS of company A uses an early version of the I/A Series system produced by Foxboro. The system is divided into three network segments: unit 1 and unit 2 each form one segment, and the public (utility) system is a segment cascaded onto the segments of unit 1 and unit 2. Two Enterasys A4H124-24FX 24-port switches are configured for each unit and for the public system. This bus structure suits small networks, since it needs only two switches per segment and keeps the network cost low. The network segment structure of a single unit is shown in figure 1.

3.2. DCS network status
The field bus uses the IEEE 802.3 protocol, whose medium access control is Carrier Sense Multiple Access with Collision Detection (CSMA/CD), at a transmission rate of 10Mb/s. The transmission distance is 184m over coaxial cable and 2km over optical cable. IEEE 802.3 uses the 1-persistent algorithm, which lets devices on the bus seize the channel promptly and reduces idle time. A device does not start sending the instant the network changes from busy to idle. It first waits for the minimum interframe gap, and it keeps listening for collisions while it transmits.
Only a few of the CP60FT controllers in the DCS have been upgraded to FCP270, so most CP60FT controllers still share the same NCNI communication interface.
With CSMA/CD, collisions become severe at peak data times. Bus efficiency then drops sharply, and in extreme cases the bus can become blocked. The mechanism by which blocking arises is shown in figure 2. The upgraded FCP270 controllers are connected directly to the redundant switches, which work on the store-and-forward principle, so they have no collision-domain problem.
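The loss of efficiency under load can be illustrated with a simplified slotted contention model. This is a sketch under stated assumptions, not a simulation of IEEE 802.3 itself: every station is assumed to attempt transmission in a contention slot with the same probability p_transmit (a hypothetical parameter), and a slot is useful only if exactly one station transmits.

```python
def slot_success_probability(stations: int, p_transmit: float) -> float:
    """Probability that exactly one of `stations` transmits in a slot.

    Simplified model (an assumption, not part of IEEE 802.3): each
    station independently transmits with probability `p_transmit`.
    Zero senders leave the slot idle; two or more senders collide.
    """
    if stations < 1:
        raise ValueError("stations must be at least 1")
    if not 0.0 <= p_transmit <= 1.0:
        raise ValueError("p_transmit must be in [0, 1]")
    return stations * p_transmit * (1.0 - p_transmit) ** (stations - 1)


if __name__ == "__main__":
    # Heavy demand: every station wants the bus half of the time.
    for n in (2, 5, 10, 20):
        print(n, slot_success_probability(n, 0.5))
```

With a fixed, high transmit probability the share of useful slots falls quickly as more controllers share one collision domain, which is the qualitative behavior described for the shared NCNI interface at peak data times.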

3.3. Switch-related cause
The three network segments of unit 1, unit 2 and the public system are connected by layer 2 switches, which have no routing function. The two units and the public system therefore belong to the same broadcast domain, and a broadcast storm in one segment cannot be fully isolated from the others, as shown in figure 3.
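The effect of the interconnection layer on broadcast domains can be sketched as a small graph computation. The segment names and the link list are illustrative assumptions, not the actual switch configuration of company A. Layer 2 links are treated as forwarding broadcasts, and layer 3 links as blocking them.

```python
from collections import defaultdict


def broadcast_domains(segments, links):
    """Group network segments into broadcast domains.

    `links` holds (segment_a, segment_b, layer) tuples. A layer 2 link
    forwards broadcast frames, so both ends share one domain; a layer 3
    link routes between segments and does not forward broadcasts.
    """
    neighbours = defaultdict(set)
    for a, b, layer in links:
        if layer == 2:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen = set()
    domains = []
    for start in segments:
        if start in seen:
            continue
        # Depth-first walk over layer 2 links only.
        domain = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in domain:
                continue
            domain.add(node)
            stack.extend(neighbours[node] - domain)
        seen |= domain
        domains.append(sorted(domain))
    return domains


if __name__ == "__main__":
    segments = ["unit1", "unit2", "public"]
    print(broadcast_domains(segments, [("unit1", "public", 2), ("unit2", "public", 2)]))
    print(broadcast_domains(segments, [("unit1", "public", 3), ("unit2", "public", 3)]))
```

With layer 2 links all three segments fall into one domain, so a storm in any segment reaches the others. With layer 3 links each segment forms its own domain.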

The solution
The I/A Series control system of company A has been running for more than ten years. Because the CP60FT hardware is aging and increasingly difficult to maintain, Foxboro has introduced the FCP270 and FCP280 controllers to replace it. Versions after I/A 8.0 use Mesh networks built from commercial switches. The Mesh control network is a DCS control network launched by Invensys Process Systems on the basis of recent developments in network technology. Its communication standard is IEEE 802.1w and its transmission rate is 100Mb/s or 1Gb/s. The transmission distance is 2km over multi-mode optical cable and 10km over single-mode optical cable. The network can be built with a bus, ring, star or tree structure. The design idea of the Mesh control network is to provide multiple communication paths between any two devices, so that single-point and multi-point faults do not interrupt communication and the redundancy of the network is improved. The network topology of the FCP270 is shown in figure 4.
To overcome the shortcomings of the Spanning Tree Protocol (STP), the IEEE introduced the 802.1w standard at the beginning of the 21st century. It defines the Rapid Spanning Tree Protocol (RSTP), which supplements the 802.1d standard. IEEE 802.1d solves the problem of endless loops caused by closed (redundant) links, but its spanning tree still takes a long time to converge, and this is why IEEE 802.1w was drawn up. RSTP builds into the original IEEE 802.1d many spanning tree extensions that were previously offered as add-ons, such as PortFast, UplinkFast and BackboneFast. It provides fast failover when a switch (bridge), a switch port or an entire LAN fails, by using an active bridge-to-bridge handshake instead of relying on the timers set by the IEEE 802.1d root bridge.
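The long convergence time of IEEE 802.1d follows from its timers, and a short calculation makes this concrete. The sketch assumes the commonly cited 802.1d default timer values (max age 20s, forward delay 15s). These values are not taken from the system of company A, and the switches there may be configured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StpTimers:
    """IEEE 802.1d timers in seconds (commonly cited defaults, assumed here)."""
    max_age_s: float = 20.0
    forward_delay_s: float = 15.0


def stp_reconvergence_s(timers: StpTimers = StpTimers(),
                        direct_failure: bool = False) -> float:
    """Approximate time before a blocked port starts forwarding.

    The port spends one forward delay in listening and one in learning.
    For an indirect failure the bridge first waits for the stored
    root information to age out (max age).
    """
    listening_and_learning = 2 * timers.forward_delay_s
    if direct_failure:
        return listening_and_learning
    return timers.max_age_s + listening_and_learning


if __name__ == "__main__":
    print(stp_reconvergence_s(direct_failure=True))   # 30.0
    print(stp_reconvergence_s())                      # 50.0
```

Tens of seconds of interruption under these timers is what the handshake mechanism of IEEE 802.1w is designed to avoid.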
IEEE 802.1w also defines Alternate Port and Backup Port roles for fast switchover: when the root port or designated port fails, the Alternate Port or Backup Port enters the forwarding state without delay. These two improvements make convergence faster.

Table 2. Several common network failure tolerance times.

Application field | Allowed time
Non-real-time automation systems, such as enterprise resource planning and manufacturing execution systems | <10s
General automation systems, such as human-machine interfaces, SCADA and building automation | <1s
Factory automation systems, such as manufacturing automation, process automation and power plants | <100ms
Real-time automation systems, such as synchronous drives, robotic controls and substations | <10ms

As can be seen from tables 1-3, the star topology gives the best network performance for a power plant process automation system. However, no single network can meet the requirements of a control system that demands high reliability, so network redundancy must be used to improve reliability. In practice, company A should strengthen real-time monitoring and running-state evaluation of the DCS. The monitoring and evaluation program should cover optical fiber connections, equipment aging and network traffic, all of which affect the quality of network communication. Effective monitoring and maintenance help to keep the system running continuously, safely and stably. In the future overall upgrade of the controllers, a redundant star network should be the first choice, since it better meets the process automation requirements of the power plant. At the same time, layer 3 switches for inter-domain isolation should be added between unit 1, unit 2 and the public system to completely isolate broadcast storms between network domains.
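The tolerance times in Table 2 can be written as a small lookup for checking a measured or specified network recovery time. This is a minimal sketch: the class names are shortened from the table, and the recovery times passed in the usage lines are hypothetical values rather than measurements from company A.

```python
# Tolerance limits from Table 2, in seconds (the recovery time must stay below the limit).
TOLERANCE_S = [
    ("non-real-time automation", 10.0),
    ("general automation", 1.0),
    ("factory automation", 0.1),   # includes power plants
    ("real-time automation", 0.01),
]


def classes_satisfied(recovery_time_s: float) -> list:
    """Return the application classes whose limit the recovery time meets."""
    if recovery_time_s < 0:
        raise ValueError("recovery_time_s must be non-negative")
    return [name for name, limit in TOLERANCE_S if recovery_time_s < limit]


if __name__ == "__main__":
    # Hypothetical recovery times, for illustration only.
    print(classes_satisfied(0.05))
    print(classes_satisfied(50.0))
```

A recovery time of 0.05s satisfies the factory automation class that covers power plants, whereas a recovery time of 50s satisfies none of the classes.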

Conclusion
During DCS operation the network is normally efficient and reliable, but many factors can cause network failure, including hardware and software factors, power supply factors and human factors. Network faults in a power plant DCS can be handled with an approach centered on hardware upgrades, port configuration and structure optimization. On this basis, the system also needs periodic inspection and maintenance to guarantee reliable operation, reduce the failure rate and improve the ability to isolate network faults under extreme conditions. Through in-depth analysis of the fault, this paper proposes a feasible network optimization method as a reference for network fault diagnosis and handling, so as to ensure the safe and stable operation of the system.