Propylene Yield from Olefin Plant Utilizing Box-Cox Transformation in Regression Analysis

Propylene yield is one of the key operating parameters monitored daily in a running olefin plant. This study was conducted in an actual world-scale olefin plant to measure the impact of identified controlled variables on the propylene yield. The Box-Cox data transformation was adopted in the Regression Analysis using Minitab Software Version 18 because non-normal data were observed after normality and stability tests were conducted using the Box Plot, I-MR Chart, Run Chart, Graphical Summary, and Normality Plot tools. The final model showed that propylene yield in the studied plant was driven by three significant factors, with coefficients of -0.000243 for Hearth Burner Flow, 0.01332 for Integral Burner Flow, and 0.08598 for Naphtha Feed Flow. The Response Optimizer tool also suggested that the propylene yield from naphtha liquid feed can be maximized at 11.22% with the control settings at 10,993.86 kg/hr of Hearth Burner Flow, 604.61 kg/hr of Integral Burner Flow, and 63.50 t/hr of Naphtha Feed Flow.


Introduction
Propylene yield monitoring is essential in the steam cracker furnace as its value translates to the profit generation and sustainability of the olefin plant. This study was conducted with a focus on propylene yield at the newly commissioned olefin plant. The plant was designed to produce 645 KTA capacity of polymer grade propylene product from naphtha feed via pyrolysis cracking in the steam cracker furnace.
Pyrolysis cracking refers to a reaction that breaks hydrocarbon bonds to form smaller unsaturated molecules [1,2]. High-value olefins such as ethylene and propylene are normally produced from the elevated-temperature cracking reaction [3,4] in the steam cracker furnace. Pyrolysis cracking was primarily developed in the early 1920s to enhance the quality and quantity of gasoline components for the refinery [5], and the technology was continuously developed for olefin production. It is also the heart of the olefin process, since the main reaction in the olefin plant occurs via pyrolysis cracking in the steam cracker furnace.
For this reason, the pyrolysis reaction is often regarded as the key factor for the smooth running of the olefin plant. Studying the pyrolysis reaction under actual plant conditions is more challenging than in lab-scale experiments because of the extreme process fluctuation [6] in the olefin plant resulting from dynamic operation in the upstream processes.
An olefin plant producing ethylene and propylene by thermal cracking is often described as the core of petrochemical manufacturing [7,8] due to its significant impact on the industry. The furnace is the primary equipment in the olefin production industry [9], and its safe and stable operation is essential [10,11] in determining the yield and quality of the olefin [12] produced. Fig. 1 shows the configuration of the steam cracker furnace in the studied plant, while Table 1 shows the naphtha feed specification utilized in the steam cracker furnace during the study.
The process starts with the introduction of naphtha into the steam cracker furnace at the first convection bank, where it mixes with Dilution Steam (DS) in the middle bank. The mixing is designed to improve olefin selectivity, mainly by reducing the partial pressure of the naphtha feed [13,14]. The reduced partial pressure resulting from the introduction of DS favors the reversible reaction towards the olefin side, following Le Chatelier's principle [15,16], and therefore improves the Propylene Yield significantly.
The mixed feed of naphtha and DS is further cracked in the furnace coils at the radiation section, which is operated at an elevated Tube Metal Temperature (TMT) of 1,050 °C to 1,180 °C with a controlled short residence time (SRT) of <2 seconds. This condition maximizes the yield of light components such as ethylene and propylene. The mixture, now called cracked gas, is then processed at the Transfer Line Exchanger (TLE), which rapidly cools the cracked gas effluent. Boiler Feed Water (BFW) is utilized as the cooling medium in the TLE. The cracked gas quenched in the TLE continues to the downstream equipment for further quenching, compression, cooling, and product separation. Super High Pressure (SHP) steam generated from the TLE is used to drive the Charge Gas Compressor (CGC) turbine in the downstream process.
Linear regression is a well-known statistical approach used to systematically relate various input parameters to a target output. Previous studies [6,17-19] have shown it to be successful in describing the relationship of each input variable to the output variable. It is also practical for specific process monitoring and planning, including in a large-scale olefin plant.
A Box-Cox transformation converts a non-normal dependent variable into an approximately normal shape so that a broader range of statistical tests can be applied. The transformation, proposed by Box and Cox [20], aims at reducing anomalies in the data and ensuring that the usual assumptions of a linear model hold [20,21]. It is a modification of the power transformation that accounts for the discontinuity at λ = 0 [22,23]. The Box-Cox Transformation has also been studied intensively and is a fundamental tool in Regression analysis [22,23].
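As an illustration of the transformation described above, the following is a minimal Python sketch of the one-parameter Box-Cox transform and its inverse for strictly positive data. The function names and the sample values are hypothetical, and the sketch is not the Minitab implementation used in this study; it only shows how the λ = 0 case is handled as the logarithmic limit of the power form.

```python
import math


def box_cox(y, lam):
    """One-parameter Box-Cox transform of a single positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        # Limit of (y**lam - 1) / lam as lam approaches 0
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(w, lam):
    """Map a transformed value w back to the original scale."""
    if lam == 0:
        return math.exp(w)
    return (lam * w + 1.0) ** (1.0 / lam)


# Hypothetical yield values (%), used only to exercise the functions
sample = [10.2, 10.8, 11.1]
transformed = [box_cox(y, 0.5) for y in sample]
recovered = [inverse_box_cox(w, 0.5) for w in transformed]
```

The round trip through `inverse_box_cox` is what allows predictions made on the transformed scale to be reported in the original units.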

Equipment/Tools
The study was conducted utilizing the industrial scale SRT VII furnace and its auxiliaries including Steam Drum, Induced Draft Fan (IDF), Burners, and TLEs. The studied furnace was designed to process 93 t/hr of straight-run naphtha liquid from the upstream plant. The studied plant was carefully designed by an established Olefin Licensor, Lummus Technology Heat Transfer (LTHT) from the United States. The site construction was completed by Toyo Engineering, Japan.
The data collection was conducted using Process Information Management System (PIMS) Software, PI Process Book Version 2015. The analysis of these data was conducted using the Regression analysis in Minitab Software Version 18.

Five controlled variables, namely Hearth Burner Flow, Integral Burner Flow, Dilution Steam Flow, Naphtha Feed Flow, and Coil Outlet Temperature, were chosen as the inputs to the Propylene Yield in the studied plant. The data for the analysis were extracted hourly (time-weighted) from PI Process Book. The selected period ran from 24th January 2020, 1900 hrs, to 2nd February 2020, 1200 hrs (207 hrs total). This resulted in a total of 1,242 data records for the analysis (207 hourly records for each of 1 output and 5 input variables).
The analysis was based on actual plant conditions, using these 1,242 historical plant data records for the Regression model. The model established from these data therefore represents the current condition of the studied plant. In addition, the model for Propylene Yield was developed with the Naphtha Feed specification shown in Table 1. The model is therefore limited to representing process conditions with a feed specification range close to that in Table 1.
Data stability was verified using the Box Plot, Individual-Moving Range (I-MR) Chart, and Run Chart, while data normality was verified using the Graphical Summary and Normality Test in Minitab Software. These steps were taken before deciding to conduct the analysis with the Box-Cox Transformation. The Box-Cox Transformation was to be applied to the Regression analysis if the majority of these tests found the data unstable and non-normal.
This study was conducted without thorough outlier removal and data cleaning, since it was intended to see the impact of Regression using the Box-Cox Transformation on non-normal data. The P-Value for both the stability and normality checks should be below 0.05 for the analysis to proceed with the Box-Cox Transformation.
During the Regression analysis, the results were checked to ensure that the Variance Inflation Factor (VIF) was lower than 10 and the P-Value was less than 0.05 for all studied variables.
VIF quantifies the severity of multicollinearity among the predictors in an ordinary least squares Regression analysis. A VIF of <10 [24] or <5 [25] is recommended to limit multicollinearity in the final model. A high VIF in the Regression analysis may result in an unreliable final model equation.
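To make the VIF criterion concrete, the following is a minimal Python sketch, under stated assumptions, of one standard way to compute it: the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix, which is equivalent to 1/(1 - R_j^2). The function names and the two short data columns are hypothetical and serve only as an illustration; the study itself obtained its VIFs from Minitab.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return one VIF per predictor column."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(k)]


# Hypothetical predictor columns (not plant data)
hearth = [1.0, 2.0, 3.0, 4.0, 5.0]
integral = [2.0, 1.0, 4.0, 3.0, 5.0]
vifs = vif([hearth, integral])
```

With only two predictors both VIFs are identical, because each predictor is explained by the other to the same degree; a value above the chosen threshold (10 or 5) would flag the pair as collinear.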
The Regression was repeated several times until satisfactory VIFs and P-Values were achieved. Once these were achieved, residual removal was applied to the latest Regression model, using the Normality Plot and the Individual-Moving Range (I-MR) chart of the residuals identified in the Regression analysis.
After all such residuals had been successfully removed from the latest Regression, the Final Regression was reassessed with all variables (including previously excluded variables). Successful removal was confirmed by the I-MR Chart (no red out-of-control points) and the Normality Plot (P-Value >0.05).
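The residual screening described above can be sketched with the textbook I-MR control limits. The example below is a minimal sketch that assumes the conventional constants for a moving range of span 2 (2.66 for the Individuals chart and 3.267 for the Moving Range chart) and checks only the basic rule of a point beyond the control limits. The function names and residual values are hypothetical, and the sketch does not reproduce the full set of tests that Minitab can apply.

```python
def imr_limits(values):
    """Control limits for an Individuals and Moving Range (I-MR) chart."""
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        "i_lcl": center - 2.66 * mr_bar,   # lower limit, Individuals chart
        "i_ucl": center + 2.66 * mr_bar,   # upper limit, Individuals chart
        "mr_ucl": 3.267 * mr_bar,          # upper limit, Moving Range chart
    }


def out_of_control(values):
    """Indices of points outside the Individuals chart limits."""
    limits = imr_limits(values)
    return [i for i, v in enumerate(values)
            if v < limits["i_lcl"] or v > limits["i_ucl"]]


# Hypothetical residuals with one unusual observation at index 10
residuals = [0.0, 0.1] * 5 + [2.0, 0.1] + [0.0, 0.1] * 4
flagged = out_of_control(residuals)
```

In an iterative workflow such as the one described here, the flagged observations would be removed, the Regression refitted, and the chart redrawn until no point is flagged.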
The Response Optimizer tool was applied to the final model, developed from actual plant data with the Box-Cox Transformation, to predict the maximum Propylene Yield and the corresponding process settings.
In addition, a Surface Plot in the form of a three-dimensional graph was used to examine the relationship between the significant variables in the final model. It helped to relate pairs of continuous significant variables to the fitted response value, Propylene Yield, based on the final model equation.

Results and Discussion
The results of the stability and normality verification using the Box Plot, I-MR, Run Chart, Graphical Summary, and Normality Test are shown in Table 2. The analysis found the data for all studied variables to be non-normal except for Dilution Steam Flow. Therefore, the Regression analysis was conducted for all input variables using the Box-Cox Transformation in Minitab Software. No outlier removal or data cleaning was conducted before the Regression analysis except for normal data verification (to remove irrelevant data where required).
The Regression analysis in Minitab Software was repeated several times until all VIFs and P-Values were satisfactory, in order to develop a reliable mathematical model for Propylene Yield in the studied plant. Table 3 shows the results of the initial (1st) and final (9th) Regression analyses conducted during the study. The VIF and P-Value in the 2nd Regression already met the criteria of VIF <10 and P-Value <0.05. However, the R-Square in the 2nd Regression was only 54.43%, meaning that the model accounted for only 54.43% of the variability in the data. This low value was not caused by the Box-Cox Transformation; it was mainly due to the variation in the data collected during the study period.
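For readers less familiar with the statistic, R-Square is one minus the ratio of the residual sum of squares to the total sum of squares. The following is a minimal Python sketch of that definition; the function name and the sample values are hypothetical and are not taken from the plant data.

```python
def r_squared(actual, fitted):
    """Share of the variability in `actual` that is explained by `fitted`."""
    mean = sum(actual) / len(actual)
    ss_total = sum((y - mean) ** 2 for y in actual)
    ss_residual = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    return 1.0 - ss_residual / ss_total


# Hypothetical observed and fitted yields (%), for illustration only
actual = [10.6, 10.8, 10.9, 11.0, 11.2]
fitted = [10.7, 10.8, 10.8, 11.0, 11.2]
r2 = r_squared(actual, fitted)
```

Removing observations with large residuals shrinks the residual sum of squares faster than the total sum of squares, which is why residual removal tends to raise R-Square.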
Residual removal was conducted to improve the R-Square of the model. These residuals represent the difference between the actual data and the values predicted by the Regression model. The Unusual Observations listed in the Fits and Diagnostics table in the Minitab Session Window and the out-of-control residuals in the I-MR Plot were removed from the 3rd Regression through the 9th Regression (Final). Fig. 2 and Fig. 3 show the Probability Plot and the I-MR chart of the residuals for the 9th Regression (Final), respectively. The P-Value of 0.986 in Fig. 2 is higher than the minimum level of 0.05, and therefore this plot was accepted. In addition, no residual was detected outside the control limits of the Moving Range Plot in Fig. 3, which indicated that the final model was good at predicting the Propylene Yield in the studied plant.
The final model obtained from the 9th Regression is shown in Eq. (1), where λ = -5 and g = 10.8290 is the geometric mean of Propylene Yield.
Table 4 shows the summary for the final model. Its R-Square was 63.70%, meaning that 63.70% of the data variability was accounted for by the final model. This value is lower than the 75% advised for experiments conducted in a controlled environment [26,27]. However, there is no fixed rule for an acceptable R-Square value, which depends on the case. The value of 63.70% was also good considering that the study was conducted in an actual large-scale plant where process variations often occur. The R-Square of this final model was also better than the 54.43% obtained without residual removal in the 2nd Regression model.
Fig. 4 (a) and Fig. 4 (b) show that operating at a lower Hearth Burner (HB) Flow together with a higher Integral Burner (IB) Flow and Naphtha Flow may result in a higher Propylene Yield. In addition, Fig. 4 (c) shows that a higher IB Flow combined with a higher Naphtha Flow may also result in a higher Propylene Yield in the studied plant.
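Because the coefficients of Eq. (1) apply to the Box-Cox-transformed response, a fitted value has to be back-transformed before it can be read as a yield in percent. The sketch below assumes the geometric-mean-standardized form W = (Y^λ - 1) / (λ g^(λ - 1)), which is a common way of reporting a Box-Cox regression when a geometric mean g is quoted. That form is an assumption here and the function names are hypothetical; only λ = -5, g = 10.8290, and the yield of 11.22% are taken from the text.

```python
LAMBDA = -5.0        # Box-Cox parameter quoted for Eq. (1)
GEO_MEAN = 10.8290   # geometric mean of Propylene Yield quoted for Eq. (1)


def to_transformed(y, lam=LAMBDA, g=GEO_MEAN):
    """Assumed standardized Box-Cox form: (y**lam - 1) / (lam * g**(lam - 1))."""
    return (y ** lam - 1.0) / (lam * g ** (lam - 1.0))


def to_yield(w, lam=LAMBDA, g=GEO_MEAN):
    """Invert the assumed form to recover the yield in percent."""
    return (1.0 + lam * g ** (lam - 1.0) * w) ** (1.0 / lam)


# Round trip for the maximum yield reported by the Response Optimizer
w_opt = to_transformed(11.22)
y_back = to_yield(w_opt)
```

Under this assumed form the transformed response increases with yield, and near Y = g a one-unit change in W corresponds to roughly a one-unit change in yield, which keeps the coefficients on a scale comparable to the original units.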
Overall, Fig. 4 suggested that manipulating IB Flow and Naphtha Flow is more favorable for achieving a higher Propylene Yield in the studied plant. This result is also aligned with the model in Eq. (1), where IB Flow and Naphtha Flow have the largest coefficients, 0.01332 and 0.08598 respectively, towards Propylene Yield.
Fig. 5 shows the optimum process condition for achieving the maximum Propylene Yield from the identified significant variables, while Table 5 shows the Multiple Response Prediction for the best Propylene Yield in the studied plant. The Response Optimizer result showed that Propylene Yield in the studied plant can be maximized at 11.22% with the optimized values of 10,993.86 kg/hr of Hearth Burner Flow, 604.61 kg/hr of Integral Burner Flow, and 63.50 t/hr of Naphtha Feed Flow. The range of process conditions that maximizes the Propylene Yield at a 95% confidence level can also be taken from the High and Low range settings in Fig. 5. These settings are helpful as a guide for Panel Operators and Operations Engineers in the studied plant to maximize the Propylene Yield.
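The direction of the optimum can be reasoned about with a simplified stand-in for the optimizer. Because a Box-Cox-transformed response increases monotonically with the original response for positive data, maximizing a linear model on the transformed scale also maximizes the yield, and a linear model over a box of operating ranges is maximized by taking the upper bound for each positive coefficient and the lower bound for each negative one. The sketch below illustrates this; it is not the search that the Response Optimizer performs. The coefficients are those quoted in the text, while the operating ranges are hypothetical, chosen on the assumption that the reported optimum lies at the edge of the observed range.

```python
# Coefficients on the transformed scale, as quoted in the text
COEFFICIENTS = {
    "hearth_burner_flow_kg_hr": -0.000243,
    "integral_burner_flow_kg_hr": 0.01332,
    "naphtha_feed_flow_t_hr": 0.08598,
}

# Hypothetical (low, high) operating ranges. Only the ends that coincide
# with the reported optimum come from the text; the other ends are invented.
BOUNDS = {
    "hearth_burner_flow_kg_hr": (10993.86, 11500.0),
    "integral_burner_flow_kg_hr": (550.0, 604.61),
    "naphtha_feed_flow_t_hr": (60.0, 63.50),
}


def maximize_linear(coefficients, bounds):
    """Pick the bound that maximizes each term of a linear model."""
    settings = {}
    for name, coefficient in coefficients.items():
        low, high = bounds[name]
        settings[name] = high if coefficient > 0 else low
    return settings


best = maximize_linear(COEFFICIENTS, BOUNDS)
```

Under these assumed ranges the sketch selects the lowest Hearth Burner Flow and the highest Integral Burner Flow and Naphtha Feed Flow, which matches the direction of the settings reported above.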

Conclusion
The Propylene Yield model was successfully established in this study using the Box-Cox Data Transformation in the Regression Analysis. The maximum Propylene Yield achievable from the model was identified as 11.22%, which can be obtained by careful monitoring of the significant variables: Hearth Burner Flow, Integral Burner Flow, and Naphtha Feed Flow.