Evaluation of Effective Maintenance and Reliability Operation Management – A Review

Manufacturing operations are often carried out with one major goal in mind: to develop products that satisfy increasing customer demand for reliable and high-quality goods. Achieving this target is often hindered by factors such as machine downtime caused by faults in the machinery or production process, which lead to time and financial losses. Hence, it is important for operation managers to ensure that measures are employed at every stage of production to address these factors effectively and enable unhindered production. Commonly used strategies include timely maintenance of machines and facilities and reliability analysis of components and processes. This paper assesses the influence of effective maintenance and reliability practices on operations management.


Introduction
Operations management is the aspect of business concerned with managing the process of converting inputs into products in the form of goods and services [1]. It handles a variety of strategic matters such as manufacturing plant size and project management techniques. Other aspects include inventory management, acquisition of raw materials, quality control, maintenance policies and material handling [2]. It mainly involves planning, organizing, and supervising the process of manufacturing, production or service provision. It is delivery-focused by nature, concerned with the efficient transformation of material inputs into outputs [3]. Due to increasingly complex business environments, companies are driven to continually enhance organizational performance, pushing decision-makers such as operation managers to increase the efficiency and effectiveness of production activities [1]. Modern industrial manufacturing involves mass production and large-scale assembly plants featuring numerous machines and items of equipment that work together to complete a given product [4]. In such a setup, operations management is necessary to properly guide all production activities and ensure effective use of materials and resources.
The equipment and machinery used in the production process are prone to failures caused by factors such as wear or fatigue, faulty design and, sometimes, continuous operation beyond permitted limits. Failure of these components during production must be avoided. Facilities such as food processing plants are severely affected by downtime (inactivity) due to machinery failure, as losing a single piece of equipment can halt production, leading to loss of product and, in turn, profits. Machine downtime for a food packaging assembly line can cost around $15,000 per hour, a considerable amount of revenue to lose. Such unnecessary costs can be avoided through proactive maintenance and reliability assessment of equipment, as most equipment requires periodic maintenance to reach its design life [5] [6]. Developing efficient techniques for maintenance and reliability assessment of production systems is thus crucial for ensuring adequate performance under varying operating conditions. In addition, a fully reliable production system supports company sustainability in an ever-changing business environment [7]. This paper discusses the concepts of reliability and maintenance and examines the benefits of implementing effective maintenance and reliability practices in operations management.

Maintenance
Maintenance plays a key role in preserving the design life of any given piece of equipment or plant. Adjustment of loose belts, proper lubrication of parts and replacement of faulty components are all basic maintenance practices that improve equipment life. Machines that are properly maintained hold tolerances better, generate less scrap, and improve the consistency and quality of the parts being produced [8]. 'Maintenance' refers to the process of keeping machines and equipment in good condition to maintain efficiency and extend their useful life. It comprises the different actions taken by the organization to replace, repair and maintain the components and equipment of the plant, which allows continuous operation within satisfactory limits [9]. Maintenance management can thus be regarded as a restorative function of production management focused on keeping equipment, machines and plant services constantly available in proper operating condition [10].
Various maintenance management techniques (see Figure 1) have been developed over the years to ensure that the maximum design life of equipment is reached and/or exceeded. These include predictive maintenance, reactive maintenance, preventive maintenance, and reliability-centred maintenance [10].

Preventive Maintenance (PM)
Preventive maintenance refers to time-based or meter-based service activities that extend equipment life and identify potential problems through inspection and early detection. It can include tasks performed on certain equipment through service contracts, cleaning activities, inspections, lubrication tasks, testing, etc. Inspection plays a major role in preventive maintenance as it leads to early detection and correction of faults [11]. It is the most popular maintenance management strategy for production operations [12].

Predictive Maintenance (PDM)
This method differs from preventive maintenance as the need for maintenance is based on the actual condition of the machine rather than using a pre-set schedule [6]. It involves the application of computer and digital technologies for condition monitoring of equipment to allow early detection of faults and provide greater precision in intervention. It usually includes monitoring techniques such as electrical surge comparisons, ultrasonic and vibration analysis, oil and coolant analysis, thermographic analysis, wear particle analysis, etc. [11]. This method is easily used in conjunction with preventive maintenance as it provides additional data required to develop an efficient predictive maintenance plan. It is commonly employed in large organizations that have dynamic operations [12].
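The difference between a schedule-based (preventive) trigger and a condition-based (predictive) trigger can be illustrated with a minimal Python sketch. The `MachineStatus` fields, the 500-hour service interval and the vibration limit are hypothetical values chosen for illustration only; they are not taken from the cited sources.

```python
from dataclasses import dataclass


@dataclass
class MachineStatus:
    hours_since_service: float  # running hours since the last service
    vibration_mm_s: float       # latest vibration reading (hypothetical sensor)


def preventive_due(status: MachineStatus, service_interval_h: float = 500.0) -> bool:
    """Preventive rule: service on a fixed schedule, whatever the condition."""
    return status.hours_since_service >= service_interval_h


def predictive_due(status: MachineStatus, vibration_limit_mm_s: float = 7.1) -> bool:
    """Predictive rule: service only when the monitored condition degrades."""
    return status.vibration_mm_s >= vibration_limit_mm_s


# A recently serviced machine whose vibration reading is already high.
machine = MachineStatus(hours_since_service=120.0, vibration_mm_s=8.4)
print("preventive trigger:", preventive_due(machine))
print("predictive trigger:", predictive_due(machine))
```

In this illustrative case the schedule-based rule would wait for the next service date, while the condition-based rule would flag the machine for early intervention.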

Breakdown (Reactive) Maintenance
This method essentially follows a "run until broken" approach. No effort is made to maintain the equipment as initially designed to ensure that it reaches the end of its design life. Due to its simple and straightforward approach, it is still the prime maintenance management strategy [6]. It is also referred to as "run-to-failure" management and is mostly suitable for non-critical components such as light bulbs and batteries, or for redundant components that do not affect the production process [12].

Reliability-Centred Maintenance (RCM)
This is a maintenance planning concept that ensures that systems continually function in line with operator requirements at a given time. Essentially, it is the process of determining the most effective maintenance strategy for equipment within a facility by combining all other maintenance methods [13]. The merits and demerits of the different maintenance strategies for machinery/equipment are presented in Table 1.
Table 1. Merits and demerits of maintenance strategies (surviving entries only)
Merits: improves production quality.
Demerits: more labour intensive; unnecessary maintenance of equipment (loss of man-hours and revenue); can cause early deterioration of equipment [5,7,8,9,11,12].
Predictive: [5,7,9,11,18]

Reliability
Different definitions of reliability have been put forth by different authors over the years. Generally, it can be defined from two main perspectives, i.e., the user/consumer's perspective or the producer's perspective. 'Reliability' describes the ability of an item or system to perform a required function for a specific period of time under specific conditions [13] [14]. In this case, the "system" is either an electronic or mechanical hardware product, a software product, a manufacturing process, or a service [15] [16]. It is "the probability of functioning without failure throughout the life cycle of a product or within a given amount of time operating in specific environmental conditions" [17]. Reliability is also broadly defined as the science of predicting, analysing, preventing, and mitigating failures that may develop over time [18].

Associated Terminologies
Fundamental terms when dealing with reliability include failure, failure rate, mean time to repair (MTTR), mean time to failure (MTTF), and mean time between failures (MTBF) [7]. A 'failure' refers to an inoperable event or condition in which an item, or part of an item, does not perform as previously specified or does not perform at all. The term 'failure rate' describes the number of failures expected within a specified period of time. It is usually expressed in failures per million or billion hours. MTBF refers to the average number of hours elapsed between successive failures, MTTF refers to the average time to irreparable system failure, and MTTR refers to the expected time from a system failure (or shutdown) to completion of repair or maintenance. MTTR is only used for systems that can be repaired. The design reliability of a component or asset usually differs from the reliability observed while in operation, i.e., inherent versus operational reliability. Hence, when considering reliability in design, designers must understand that real-world operating conditions and maintenance techniques will vary from design assumptions and may affect overall system reliability [22]. Additionally, it is important to note that the operational reliability of a production line depends on several factors, namely [20]: human reliability (personnel involvement in management planning), process reliability (equipment operating within design conditions), equipment maintainability and reliability.
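The terms above can be made concrete with a short Python sketch. It assumes the common convention that MTBF and MTTR are estimated as total uptime and total downtime divided by the number of breakdowns, and that the failure rate is the number of breakdowns per operating hour; the function names and the sample figures are hypothetical and serve only as an illustration.

```python
def failure_rate(breakdowns: int, total_uptime_h: float) -> float:
    """Failures per operating hour over the observation window."""
    if total_uptime_h <= 0:
        raise ValueError("total_uptime_h must be positive")
    return breakdowns / total_uptime_h


def mtbf(total_uptime_h: float, breakdowns: int) -> float:
    """Mean time between failures: average operating time per breakdown."""
    if breakdowns <= 0:
        raise ValueError("at least one breakdown is needed to estimate MTBF")
    return total_uptime_h / breakdowns


def mttr(total_downtime_h: float, breakdowns: int) -> float:
    """Mean time to repair: average downtime per breakdown."""
    if breakdowns <= 0:
        raise ValueError("at least one breakdown is needed to estimate MTTR")
    return total_downtime_h / breakdowns


# Hypothetical log for one machine: 950 h running, 50 h under repair, 10 breakdowns.
uptime_h, downtime_h, n_breakdowns = 950.0, 50.0, 10

rate = failure_rate(n_breakdowns, uptime_h)
print(f"failure rate: {rate * 1e6:.0f} failures per million hours")
print(f"MTBF: {mtbf(uptime_h, n_breakdowns):.1f} h")
print(f"MTTR: {mttr(downtime_h, n_breakdowns):.1f} h")
```

Under these assumptions the failure rate is simply the reciprocal of the MTBF, which is why the two figures are often reported interchangeably for repairable equipment.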

Reliability Engineering
Given that the quality of a product declines with use over time, a time-based approach to quality is required in place of a simple initial assessment by an inspector during product design or at completion. Rather than evaluating a product on whether it passes or fails a given test, reliability examines failures in the time domain, which is what distinguishes the practice of reliability engineering from standard quality control [19]. Reliability engineering is mainly concerned with the prediction and avoidance of component or system failure. To properly quantify product or system reliability, it is necessary to understand why failures occur, how they happen and how often, as well as the associated costs. In real-life situations, all potential failures are rarely fully recognized or understood. Hence, failure prediction in reliability analysis deals primarily with probabilities [20][21]. The main functions of reliability engineering are to develop the reliability requirements of the system, design the product or system in accordance with those requirements, establish the appropriate reliability program, and perform the appropriate analyses of the system or product throughout its life [22].

Reliability Methodologies
The reliability of a process or product can be evaluated using a variety of reliability analyses suited to different stages of the product lifecycle. Reliability evaluation is a continuous process that begins at a product's conception stage and lasts through all phases of product life [14]. Methodologies used in reliability engineering include maintainability prediction, reliability prediction (RP), reliability block diagram (RBD) analysis, fault tree analysis (FTA), Weibull analysis, and accelerated life testing (ALT) analysis [23]. The most widely used techniques for reliability analysis are RP, RBD, FTA and failure mode and effects analysis (FMEA) [24].

Reliability Prediction
This method predicts component failure rates and overall system reliability. It is also useful for assessing the significance of reported failures. It examines the operational state of a given component or system at a particular time, that is, whether it is currently functional or has failed [25]. The method is based on the fact that a functional component or system will eventually fail. For non-repairable components or systems, the failed state lasts forever, whereas a component or system that can be repaired remains in the failed state for a period of time during repair before returning to a functional state once repair is finished [26][27][28]. It is assumed that the transition from one state to the other is instantaneous. 'Failure' marks the transition from a functional state to a failed state, and the change from a failed state back to a functional state is called repair. The repair is also expected to restore the component or system to an as-new state [29][30]. This cycle consists of a repair-to-failure process and, for serviceable systems, a continually repeating failure-repair process (see Figure 2). The quantities used to characterize this cycle are the total uptime, the total downtime, and the number of breakdowns. Results obtained from a reliability prediction analysis are also useful when carrying out additional analyses such as fault tree analysis (FTA) or reliability block diagram (RBD) analysis [7].
E3S Web of Conferences 309, 01012 (2021), ICMED 2021, https://doi.org/10.1051/e3sconf/202130901012

Reliability Block Diagram (RBD) Analysis
This is a deductive technique for evaluating component or system reliability. RBD provides a graphical analysis of the logical structure of a system in which reliability connections exist between parts or the entirety of the system. It allows the representation of the different ways in which partial systems or components in operation affect overall system operation [7]. Reliability block diagrams model how a complex system functions by using "blocks", each of which represents the operation of a component or subsystem. They enable one to determine both component reliabilities and overall system reliability. RBD can also be used to optimize the allocation of reliability to system components by taking into account potential reliability improvements and the associated costs of different modifications to system design [31]. It is a success-oriented network that illustrates how the operation of different functional blocks affects system function [32]. System components and their relation to one another in terms of reliability are expressed graphically using series block configurations, parallel block configurations, or a combination of the two (see Fig. 3a and 3b) [27]. Assuming independent components, the reliability of a series configuration is $R_s = \prod_{i=1}^{n} R_i$ and that of a parallel configuration is $R_p = 1 - \prod_{i=1}^{n} (1 - R_i)$, where $R_i$ is the reliability of component $i$ and $n$ = number of components.
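As a minimal sketch of how series and parallel blocks combine, the Python functions below apply the standard product rules for independent components. The function names and the component reliabilities in the example are illustrative assumptions and are not values reported in the cited works.

```python
from math import prod
from typing import Iterable


def series_reliability(reliabilities: Iterable[float]) -> float:
    """Series blocks: the system works only if every component works."""
    return prod(reliabilities)


def parallel_reliability(reliabilities: Iterable[float]) -> float:
    """Parallel blocks: the system fails only if every component fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical line: a motor, two redundant pumps in parallel, and a controller.
pumps = parallel_reliability([0.90, 0.90])
line = series_reliability([0.95, pumps, 0.98])

print(f"redundant pump stage: {pumps:.4f}")
print(f"whole line:           {line:.4f}")
```

The example shows the usual design trade-off: adding a redundant pump raises the reliability of that stage above that of either pump alone, while every extra block placed in series lowers the reliability of the whole line.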

Fault Tree Analysis (FTA)
This is a systematic way to define and analyse system failures based on failures in various combinations of components and subsystems. It could also be described as a systematic approach to identifying the main causes of events, such as fault events. It takes a top-down approach that starts with a general event and then branches into more specific details. From the top-level scenario, the preceding events that may have caused the result are mapped through the fault tree diagram (see Figure 4) [29].

Fig. 4. A Fault Tree Diagram for a Car System Failure [29]
Fault trees can be used to highlight the dependence of system design on specific components, the value of adding redundancy for various components, and the need for other design changes when the system is unreliable. They are also useful for root cause analysis [29]. FTA is an effective tool for evaluating how multiple factors affect a particular problem and provides a visual representation of different failure modes. It is useful for developing monitoring programs and for risk assessment [7].
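A fault tree can also be evaluated numerically. The sketch below assumes independent basic events combined through AND and OR gates, which is a common simplification; the tree structure and the probabilities are hypothetical and are not taken from the fault tree shown in Figure 4.

```python
from math import prod
from typing import Iterable


def and_gate(probabilities: Iterable[float]) -> float:
    """Output event occurs only if all independent input events occur."""
    return prod(probabilities)


def or_gate(probabilities: Iterable[float]) -> float:
    """Output event occurs if at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities for a "vehicle does not start" top event.
p_battery = 0.02
p_fuel = 0.01
p_ignition_primary = 0.05
p_ignition_backup = 0.05

# The ignition function is lost only if both the primary and backup paths fail.
p_ignition = and_gate([p_ignition_primary, p_ignition_backup])

# The top event occurs if any one of the three branches fails.
p_top = or_gate([p_battery, p_fuel, p_ignition])
print(f"P(top event) = {p_top:.4f}")
```

Working from the basic events up to the top event in this way makes the effect of redundancy visible: the duplicated ignition path contributes far less to the top-event probability than the single battery.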

Markov Analysis
This is a technique used to predict the value of a variable whose predicted value is affected only by the current state and not by previous activities. It involves defining the probability of a future action occurring given the current state of a variable. Once the probability of a future action in each state is determined, a decision tree can be drawn. The likelihood of a result can then be determined, provided the current state of the variable is known [30]. It is a time-dependent approach, as the probability of being in the working state varies with time. It is also an inductive analysis method useful for evaluating structures or systems with complex functions and maintenance strategies. A Markov model is a state diagram made up of circles and arrows. The component states (i.e., functional or failed) are represented by the circles, while the direction of transition between the two states (failure or repair) is indicated by an arrow. The failure rate or repair rate is indicated by a numerical value beside the arrow (see Figure 5) [7].

Fig. 5. A Simple Model for Markov Analysis [7]
The system is in state S0, if functional, or in state S1, if failure occurs. The system transitions from state S0 to state S1 at a rate of λ (failure rate), or from state S1 to state S0 at μ (rate of repair) [31].
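For the two-state model described here, the probability of finding the system in the functional state S0 can be written in closed form. The sketch below assumes constant failure and repair rates and a system that starts in S0; the function names and the numerical rates are illustrative assumptions.

```python
import math


def availability_at(t: float, failure_rate: float, repair_rate: float) -> float:
    """Probability of being in S0 at time t, starting in S0 at t = 0."""
    total = failure_rate + repair_rate
    steady = repair_rate / total
    transient = (failure_rate / total) * math.exp(-total * t)
    return steady + transient


def steady_state_availability(failure_rate: float, repair_rate: float) -> float:
    """Long-run fraction of time spent in the functional state S0."""
    return repair_rate / (failure_rate + repair_rate)


# Hypothetical rates: one failure per 100 h (lambda) and one repair per 5 h (mu).
lam, mu = 0.01, 0.2

for hours in (0.0, 5.0, 50.0):
    print(f"t = {hours:5.1f} h  P(S0) = {availability_at(hours, lam, mu):.4f}")
print(f"steady state    P(S0) = {steady_state_availability(lam, mu):.4f}")
```

The probability starts at 1 and decays towards μ/(λ + μ), which illustrates why the approach is described as time-dependent.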

Failure Mode and Effects Analysis (FMEA)
This is a reliability analysis method that evaluates the possible failure modes of a process and their potential impact on the performance of the product or system. It can be applied to equipment and systems, as well as to the analysis of manufacturing operations and their impact on products and processes. FMEA results can serve as a foundation for product/process design, for additional system analysis, or to aid the deployment of resources [7]. Also known as "failure modes, effects, and criticality analysis (FMECA)", it involves a step-by-step approach to identifying failures that can occur in service, product design, manufacturing or assembly processes [32]. It is a technique geared towards enabling organizations to mitigate failure during the design stage by recognising all the likely failures in a design or manufacturing process. It could also be described as a structured approach for identifying possible failures that might be present in a product or process design [33]. Two general categories of FMEA are [32]: Design FMEA (which examines the possibility of flaws in products, shortened product life, and safety and regulatory concerns caused by material properties, tolerances, noise, etc.) and Process FMEA (which identifies failures that affect product quality, reduce process reliability, or create safety and environmental hazards due to the materials and machinery used, processing methods, human factors, etc.). Depending on whether a product or a process is to be evaluated, FMEA is carried out using a worksheet (see Fig. 6).
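One common way of ranking the entries of an FMEA worksheet is the risk priority number (RPN), the product of severity, occurrence and detection scores. The text above does not describe a scoring scheme, so the 1 to 10 scales, the `FailureMode` class and the sample failure modes in this Python sketch are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (hazardous), assumed scale
    occurrence: int  # 1 (remote) to 10 (almost certain), assumed scale
    detection: int   # 1 (always detected) to 10 (undetectable), assumed scale

    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("scores must lie between 1 and 10")
        return self.severity * self.occurrence * self.detection


# Hypothetical process FMEA entries for a packaging line.
worksheet = [
    FailureMode("Conveyor belt slips", severity=6, occurrence=5, detection=3),
    FailureMode("Sealing jaw overheats", severity=8, occurrence=3, detection=4),
    FailureMode("Label misaligned", severity=3, occurrence=6, detection=2),
]

# Highest RPN first, so corrective effort goes to the riskiest failure mode.
ranked = sorted(worksheet, key=lambda fm: fm.rpn(), reverse=True)
for fm in ranked:
    print(f"{fm.rpn():4d}  {fm.description}")
```

Sorting by RPN is only one possible prioritisation; organisations often also set a severity threshold so that high-severity failure modes are addressed regardless of their overall score.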

Fig. 6. Sample of a Process FMEA Worksheet
Reliability is often used as an all-encompassing concept that also includes availability and maintainability. At its core, reliability deals mainly with the probability of a failure taking place at specific time intervals, whereas availability is a measure of something being in a state (mission capable) ready to be tasked [34].

Relevance of Effective Maintenance and Reliability in Operation Management
The ultimate goal of any production plant is to increase the reliability of the overall production process, i.e., to maximize output using available resources by reducing waste through improved equipment reliability and process reliability [35][36][37][38]. Manufacturers must ensure system reliability as it is crucial for effective cost analysis, continued customer satisfaction, reducing warranty costs, improving the company's image and future opportunities, and maintaining competitive advantage. Strategies for increasing operational reliability and extending equipment life are provided in [38][39][40]. These include training plant employees, using high-quality lubricants for moving machine parts, building redundancies into the production process, consistent maintenance and cleaning, and applying automation to systems. The advantages of proper equipment maintenance include the following [41][42][43]:
i. Increase in life of machinery and equipment.
ii. Production remains on schedule.
iii. Customer satisfaction is high as products are delivered on time.
iv. Good condition of machinery equals good product quality.
v. Zero production loss.
vi. Utilization of staff and machinery is increased as there is no idle time.
Applying reliability engineering to the management of production operations provides the following benefits [36]:
i. Achieves customer expectations regarding the functionality and expected life of a particular component, product, or system.
ii. Reduces health and safety risks associated with equipment.
iii. Improves system reliability and availability (reduces failure rates).
iv. Production goals are accomplished.
v. Helps improve marketing and warranty of products.
Failure to implement the necessary maintenance and reliability assessments for production equipment/machinery will lead to results contrary to those outlined above. Additionally, company policies dealing with operational reliability must integrate security, quality, risk, and financial requirements to achieve their business objectives. Policies should be established by senior managers so that they are easy to understand, trustworthy and legitimate, and guide the organization towards achieving common goals and objectives [44][45][46][47][48].

Conclusion
From the sections discussed, it is evident that proper maintenance and reliability practices are vital tools for operations management. Predictive and preventive maintenance are recommended for machinery in production plants in place of reactive maintenance, which presents greater costs in the long run. Among the different tools for reliability analysis, FTA and FMEA are commonly employed by organizations as they are relatively low-tech techniques that are easy to understand and implement. Additionally, FMEA highlights the relationship between failure modes and their causes, which aids further reliability analysis and the removal of system faults. For large-scale manufacturing and production operations, an effective maintenance and reliability framework is necessary for preventing untimely system failure and unnecessary machine downtime, which greatly limit production capacity and incur time and financial costs, amongst others. Operation managers must ensure that these tools are implemented according to available standards and strictly adhered to by employees at all levels of production within the organization. This will enable production operations to run smoothly and ensure that the organization maintains optimal production capacity.
We acknowledge the financial support offered by Covenant University in the actualization of this research work for publication.