Sales Volume Forecast of Typical Auto Parts Based on Bi-GRU: A Case Study

Inventory management is an important part of the auto parts supplier business. Accurate prediction of the sales volume of different auto parts is the basis on which staff formulate marketing strategies and procurement plans. Based on the limited historical sales data of the South China, North China and East China branches of an auto parts company, several prediction models are trained and tested to determine the best model for predicting future product sales. An orthogonal experimental method is used to estimate the hyperparameters of the prediction models. In addition, a posterior test is used to verify the validity and accuracy of the bidirectional gated recurrent unit (Bi-GRU) model in predicting the sales volume of typical auto parts. The results show that, compared with the other models, the Bi-GRU model has the highest accuracy in testing, so it is used to predict the future sales of typical auto parts. The posterior test verifies the validity and accuracy of the Bi-GRU model. The orthogonal experiment method can effectively realize the hyperparameter estimation for each model. According to the prediction results, the sales volume of blind drive caps in South China, North China and East China will reach 18235, 17030 and 14949 pieces, respectively, after 90 days. Meanwhile, the corresponding sales volume of bolts will reach 13141, 15062 and 10253 pieces, respectively.


Introduction
After more than 100 years of development, the automobile industry has become a pillar industry that is closely related to economic development and social progress [1,2]. During that time, the vigorous development of the automobile industry has injected vitality into the related trade and service industries [3]. Among them, auto part sales, which are a key marker of auto after-sales maintenance services, are currently in the process of intelligent and lean development [4]. For auto part sellers, accurate forecasting of commodity sales is helpful for managers to judge inventory levels in advance, formulate reasonable purchasing and sales plans, and implement lean commodity inventory management. The development of artificial intelligence technology makes it possible to accurately predict the sales volume of commodities [5][6][7].
The remainder of this paper is organized as follows. A literature review is presented in Section 2. In Section 3, the basic situation of the auto parts supplier is introduced. The mechanism of the Bi-GRU model is analyzed in Section 4. In Section 5, the total sales data of the three branches of the auto parts supplier are used for training, testing and prediction with the Bi-GRU model, and the results are discussed. The final section concludes the study and indicates directions for future research.

Literature Review
Notable progress has been made in the field of commodity sales forecasting. He et al. introduced the cumulative and translation transform and proposed an optimized gray buffer operator to predict the production and sales of new energy vehicles in China [1]. Combining principal component analysis and a general regression neural network, Wu et al. proposed a machine learning model to predict the sales volume and growth rate of electric vehicles in China as well as worldwide. The results indicate that the sales volume of electric vehicles will continue to grow but that the growth rate will decline [2]. Sharma et al. considered various potential factors to predict the sales of Amazon books and implemented factor importance analysis, which showed that the artificial neural network achieved better prediction results [6]. Xiao et al. proposed a combination prediction model for commodity sales based on the differential evolution algorithm, which is superior in prediction accuracy to the combination prediction model based on the cross variance weight coefficient [7]. Geol et al. developed a long short-term memory (LSTM) sales model based on characteristic uncertainty to predict the global sales of the online business of boutique tourism and resort hotels [8]. Xia et al. developed a ForeXGBoost model to predict automobile sales. The model uses a missing-value filling algorithm to improve data quality and a sliding window to extract the characteristics of historical sales and production data, realizing rapid and high-precision prediction of automobile sales [9]. Hu et al. used a random forest model to incorporate spatial autocorrelation into housing sales price prediction and demonstrated the superiority of the method through a case study [10]. Pavlyshenko used a machine learning model with a stacking strategy to predict commodity sales based on actual cases [11]. Türkbayraǧí et al. determined the external factors affecting the sales of auto parts according to expert experience and proposed a neural network prediction model to predict the automotive aftermarket in Turkey [12]. Sohrabpour et al. proposed a model for forecasting export sales volume using genetic programming and conducted a parameter sensitivity analysis [13]. Wan et al. proposed a similarity-based sales forecasting (S-SF) method using the idea of collaborative filtering. The proposed S-SF method can be applied to sales forecasting for both mature and new products, which gives it broad applicability [14]. Ma et al. proposed a meta-learning framework based on a deep convolutional neural network to predict retail sales [15]. Türkbayraǧí et al. determined the relevant external factors of automobile aftermarket sales based on expert review and put forward a sales forecast model for the automobile aftermarket industry, which achieves high prediction accuracy [16].
Existing research has reference value for this work. In this study, an auto parts seller is taken as the research object. The managers of the company only recorded the daily sales volume of each type of part in the past, and the relevant factors that may affect the sales volume, including temperature and season, have not been recorded in this case. Therefore, the autocorrelation time series prediction of sales volume is the focus of this study. Because of the good performance of the Bi-GRU model in solving time-related data prediction [17][18][19][20], it will be used to realize training, testing and prediction on a limited set of sales history data.

Problem Description
A company (referred to here as Company A) is an auto parts supplier that specializes in providing a full set of spare parts for the 4S stores of major well-known auto brands. The company has three branches, established in South China, North China and East China, to meet the supply needs of different regions. Blind rivet caps and bolts are the best-selling auto parts of the company, so they are selected as representative products for predicting future sales volume. Their part numbers in the company are N90771003 and N90974701. Figure 1 shows the recorded sales volumes of these two products in the three branches. Sales at the South China, North China and East China branches were recorded for 504 days, 319 days and 447 days, respectively. Figure 1 shows that the daily sales volume of the products fluctuates randomly, so an appropriate model is needed to predict their future sales volume.

Bi-GRU Model
The GRU is an improved recurrent neural network that effectively mitigates the vanishing and exploding gradient problems [21][22][23]. The GRU not only handles time series information well but also has a simpler internal structure and is easier to train than the LSTM. The internal structure of the GRU is shown in Figure 2. First, two gate states, the reset gate r_t and the update gate z_t, are obtained from the hidden state h_{t-1} passed from the previous node and the current input X_t. Here, σ is the sigmoid function.
Then, the reset gate is used to "reset" the data to obtain the candidate state h̃_t, which contains the current node information.
Next, the update gate is used to "update the memory", selectively forgetting the original hidden state h_{t-1} and selectively remembering the candidate state h̃_t. When predicting, the GRU uses only past information, but future information usually helps to improve prediction accuracy, so the Bi-GRU was proposed [17]. The architecture of the Bi-GRU is shown in Figure 3. The Bi-GRU obtains two hidden states with opposite time directions through a forward GRU and a backward GRU and then connects them to obtain a single output. The forward GRU and backward GRU capture the past and future information of the input sequence, respectively.
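The gate mechanism described above can be illustrated with a minimal sketch. It assumes the commonly used GRU formulation, in which the new state is (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t, and it assumes that the forward and backward states are joined by concatenation; the equations of the original study are not reproduced in this text, so the weight names (Wz, Uz, bz and so on), the random initialization and the toy input sequence are all hypothetical.

```python
import math
import random


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def add(*vectors):
    return [sum(t) for t in zip(*vectors)]


def gru_step(x, h_prev, p):
    """One GRU step (assumed standard formulation)."""
    # Update gate z_t and reset gate r_t from the input and previous state.
    z = sigmoid(add(matvec(p["Wz"], x), matvec(p["Uz"], h_prev), p["bz"]))
    r = sigmoid(add(matvec(p["Wr"], x), matvec(p["Ur"], h_prev), p["br"]))
    # "Reset" the previous state, then build the candidate state.
    reset_h = [ri * hi for ri, hi in zip(r, h_prev)]
    pre = add(matvec(p["Wh"], x), matvec(p["Uh"], reset_h), p["bh"])
    h_cand = [math.tanh(a) for a in pre]
    # "Update the memory": blend the old state with the candidate state.
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


def run_gru(xs, p):
    h = [0.0] * len(p["bz"])
    states = []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bi_gru(xs, p_fwd, p_bwd):
    """Run a forward and a backward GRU and concatenate their states."""
    fwd = run_gru(xs, p_fwd)
    bwd = run_gru(xs[::-1], p_bwd)[::-1]  # re-align to the original time order
    return [f + b for f, b in zip(fwd, bwd)]


def init_params(n_in, hidden, rng):
    """Hypothetical random initialization, for illustration only."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "zrh":
        p["W" + gate] = mat(hidden, n_in)
        p["U" + gate] = mat(hidden, hidden)
        p["b" + gate] = [0.0] * hidden
    return p


rng = random.Random(0)
pf = init_params(1, 3, rng)
pb = init_params(1, 3, rng)
seq = [[0.1], [0.4], [0.3], [0.9]]  # toy normalized inputs, not company data
states = bi_gru(seq, pf, pb)
print(len(states), len(states[0]))
```

In this sketch the first half of each output vector depends only on inputs up to the current step, whereas the second half depends on the current and later inputs, which is the sense in which the backward GRU supplies future information.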

Results and Discussion
The formulation of an inventory management plan is usually based on the total sales volume of each commodity over a certain future period, so the total sales volume of each commodity is chosen as the prediction target in this study. The first 2/3 of the data recorded in each branch is used as a training set for model training, and the last 1/3 is used as a test set to evaluate the performance differences between models. The unidirectional GRU model was selected as the control group for the Bi-GRU model. Considering the advantages of the LSTM unit in processing time series data, unidirectional and bidirectional LSTM models are also used to predict the sales volume of typical auto parts. Furthermore, since support vector regression (SVR) and tree-based models such as the gradient boosted regression tree (GBRT) generally perform well on nonlinear problems [24][25][26], they are also considered.
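The data preparation described above can be sketched as follows. The sketch assumes that the prediction target is the running (cumulative) total of daily sales and that the series is cut chronologically, with no shuffling. The sliding-window construction is also an assumption, suggested by the timestep hyperparameter used later, and the function names and toy numbers are hypothetical.

```python
from itertools import accumulate


def cumulative(daily_sales):
    """Turn daily sales into the running total used as the target."""
    return list(accumulate(daily_sales))


def chronological_split(series):
    """First 2/3 of the record for training, last 1/3 for testing."""
    cut = (len(series) * 2) // 3
    return series[:cut], series[cut:]


def make_windows(series, timestep):
    """Pair each window of `timestep` values with the value that follows it."""
    inputs, targets = [], []
    for i in range(len(series) - timestep):
        inputs.append(series[i:i + timestep])
        targets.append(series[i + timestep])
    return inputs, targets


daily = [3, 0, 5, 2, 4, 1, 6, 2, 3]  # illustrative values, not company data
total = cumulative(daily)
train, test = chronological_split(total)
X_train, y_train = make_windows(train, timestep=3)
print(train, test)
print(X_train, y_train)
```

With the 504-day South China record, the same split would give 336 training days and 168 test days.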
Before model training and testing, the original data are normalized according to Eq. (6), which helps to speed up the solution process.
where x is the total amount of warehouse operations and x′ is the normalized value.
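Since Eq. (6) is not reproduced in this text, the following sketch assumes ordinary min-max scaling, x′ = (x − x_min) / (x_max − x_min), which is a common choice but may differ from the authors' formula. Fitting the minimum and maximum on the training portion only is likewise an illustrative design choice, and all names and values are hypothetical.

```python
def fit_min_max(train_values):
    """Return the (min, max) pair used for scaling."""
    return min(train_values), max(train_values)


def normalize(values, lo, hi):
    """Assumed min-max scaling: x' = (x - lo) / (hi - lo)."""
    span = hi - lo
    if span == 0:
        raise ValueError("constant series cannot be min-max scaled")
    return [(v - lo) / span for v in values]


def denormalize(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


train = [10.0, 20.0, 40.0]  # illustrative cumulative totals
test = [55.0]
lo, hi = fit_min_max(train)
train_scaled = normalize(train, lo, hi)
test_scaled = normalize(test, lo, hi)
print(train_scaled, test_scaled)
```

Because a cumulative sales series keeps growing, values from the test period fall above 1 when the scaling is fitted on the training portion; predictions are mapped back to pieces with `denormalize`.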
For the Bi-GRU model, it is necessary to optimize the hyperparameters and network structure to improve its prediction performance. The orthogonal experiment method can determine the optimal hyperparameter combination through a few attempts, which greatly reduces the calculation cost. Compared with grid search technology, this method has significant advantages in hyperparameter estimation for deep learning models. Therefore, it will be used to conduct hyperparameter estimation for each model in this study. The Bi-GRU model predicting the commodity N90771003 in South China is selected as an example, and the detailed process of the orthogonal experiment method is described here.
Table 1 lists the five adjustable hyperparameters selected for the Bi-GRU model, each of which is assigned four levels in this study. Therefore, the L16(4^5) Taguchi orthogonal array is selected. Simulation experiments are carried out according to this scheme. To reduce the influence of random factors on the model test results, each group of simulation experiments is repeated 10 times, and the average of the 10 runs is taken as the result for that group. The coefficient of determination (R²) is selected as the performance measure for this orthogonal experiment and is calculated according to Eq. (7).
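To illustrate why 16 runs suffice for five four-level factors, the sketch below builds one valid 16-run orthogonal array from arithmetic in the finite field GF(4) and checks its defining property. This is a standard construction, not necessarily the table used by the authors: the row order and the assignment of factors to columns may differ from Table 1, and the level indices 0 to 3 stand in for the actual level values, which are not reproduced in this text.

```python
from itertools import combinations, product

# Multiplication table of the finite field GF(4) on elements 0..3.
# Addition in GF(4) is bitwise XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]

# Factor names follow the text; the column assignment is an assumption.
FACTORS = ["hidden_layer", "timestep", "batch_size", "dropout", "unit"]


def l16_array():
    """Return a 16-run, 5-column, 4-level orthogonal array."""
    rows = []
    for a, b in product(range(4), repeat=2):
        # Columns: a, b, a + b, a + 2b, a + 3b (arithmetic in GF(4)).
        rows.append([a, b] + [a ^ GF4_MUL[c][b] for c in (1, 2, 3)])
    return rows


def is_orthogonal(rows):
    """Check that every pair of columns contains each level pair exactly once."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        if len({(r[i], r[j]) for r in rows}) != 16:
            return False
    return True


design = l16_array()
runs = [dict(zip(FACTORS, row)) for row in design]
print(len(runs), is_orthogonal(design))
print(runs[5])
```

A full grid over the same five factors would need 4^5 = 1024 configurations, so the orthogonal array cuts the number of training runs by a factor of 64 before the 10 repetitions per group are counted.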
where n represents the number of predicted values; y_i represents the original output value; ŷ_i represents the predicted value; ȳ represents the average of the original output values; and the overbarred ŷ represents the average of the predicted values. The simulation results of the orthogonal scheme are presented in Table 2. Range analysis of the results determines the degree of influence of each hyperparameter on model performance and the best combination of hyperparameters. In Figure 4, the results of the range analysis are plotted as the main effects plot for the Pareto ratios. The response table of the S/N ratio for the Pareto ratio includes contribution degrees based on range statistics; specifically, a large fluctuation of the S/N ratio indicates that the factor has a great influence on the results. Figure 4 shows that hidden_layer has the greatest impact on the performance of the Bi-GRU model, followed by unit and timestep. Dropout is the least important hyperparameter in this case. The larger R² is, the better the fitting performance. Therefore, in this orthogonal experiment, the best combination of hyperparameter values is hidden_layer=2, timestep=20, batch_size=25, dropout=0.15, and unit=50.
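The range analysis can be sketched as follows. For brevity, the example uses a small two-level array with three hypothetical factors (A, B, C) and made-up R² responses rather than the L16 design and the values of Table 2, and it works on the mean response per level directly rather than on S/N ratios.

```python
def range_analysis(design, responses, factor_names):
    """Mean response per level, range and best level for each factor."""
    summary = {}
    for col, name in enumerate(factor_names):
        by_level = {}
        for row, resp in zip(design, responses):
            by_level.setdefault(row[col], []).append(resp)
        level_means = {lvl: sum(v) / len(v) for lvl, v in by_level.items()}
        summary[name] = {
            "level_means": level_means,
            # A larger range means the factor moves the response more.
            "range": max(level_means.values()) - min(level_means.values()),
            # Larger R² is better, so the best level maximizes the mean.
            "best_level": max(level_means, key=level_means.get),
        }
    return summary


# Illustrative L4(2^3) array and made-up mean R² values per run.
design = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
responses = [0.90, 0.92, 0.96, 0.98]

summary = range_analysis(design, responses, ["A", "B", "C"])
ranking = sorted(summary, key=lambda k: summary[k]["range"], reverse=True)
print(ranking)
print({k: round(v["range"], 3) for k, v in summary.items()})
```

In this toy case, factor A has the largest range and would be ranked as the most influential, in the same way that hidden_layer is ranked first in the study.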
Similarly, this method is also used for hyperparameter estimation in the other cases. The optimal values of the network structure and hyperparameters are shown in Table 3. In this study, the Bi-GRU model with two hidden layers was determined to be the most suitable for solving this problem. The regression results of different models on typical auto parts sales are shown in Figure 5.
To quantitatively evaluate the performance of different models, the mean absolute error (MAE) and R² are selected as evaluation metrics. MAE is calculated by Eq. (8).
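The two metrics can be sketched as follows. MAE uses its standard definition, the mean of |y_i − ŷ_i|. For R², the sketch assumes the squared correlation form, because the symbol definitions accompanying Eq. (7) include the mean of the predicted values; if the original equation instead uses 1 − SSE/SST, the second function would need to be replaced. The function names and sample values are hypothetical.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - y_hat_i|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_squared_correlation(y_true, y_pred):
    """R² as the squared Pearson correlation (assumed form of Eq. (7))."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


y_true = [10.0, 12.0, 15.0, 19.0]  # illustrative values, not company data
y_pred = [11.0, 12.0, 14.0, 20.0]
print(mae(y_true, y_pred))
print(round(r2_squared_correlation(y_true, y_pred), 4))
```

Note that the squared correlation form is insensitive to a constant offset or scale error in the predictions, which is one reason to report MAE alongside it.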
The MAE and R² of the test results fitted by the six models are shown in Table 4, where the best result among the models is underlined. In Table 4, for both the MAE and R² indices, the Bi-GRU model always achieves the best or second-best result. In general, the Bi-GRU model performs best on the test set, followed by the GRU model. The R² of nearly all Bi-GRU fitting results exceeds 0.97. The GRU model is better than the LSTM model at predicting the sales volume of auto parts in this study, which may be caused by the small number of samples. In addition, the results in Table 4 suggest that deep learning networks are generally better than shallow machine learning models at predicting time-related data.
Considering the good performance of the Bi-GRU model, it is employed to predict the future sales of typical auto parts in the three branches. Based on the above research, the total cumulative sales volume is selected as the prediction target. Auto parts inventory management is usually based on the total sales volume in the next few months. To further verify the validity and accuracy of the Bi-GRU model, the actual sales data of the 28 days following model training are used for a posterior test. The results of the posterior test are shown in Figure 6, and the corresponding evaluation index values are shown in Table 5. For ease of observation, the previous sales volume is subtracted from the predicted sales volume, and the difference is plotted as the ordinate in Figure 6. Table 5 shows that the R² of the sales volume forecast for the following 28 days is approximately 0.96, so the validity and accuracy of the Bi-GRU model are confirmed.
Next, the Bi-GRU model is used to predict the future sales of typical auto parts. Figure 7 shows the predicted sales volumes of blind drive caps and bolts for the 90 days following model training. The sales volume of blind drive caps in South China, North China and East China will reach 18235, 17030 and 14949 pieces, respectively, in 90 days. Meanwhile, the corresponding sales volume of bolts will reach 13141, 15062 and 10253 pieces, respectively.

Conclusion
To accurately predict the sales volume of auto parts, two best-selling products were selected as representatives. Based on the sales record datasets, several machine learning models were trained and tested. Compared with the LSTM, GBRT and SVR models, the GRU model achieved better performance on the test set of typical auto parts sales, and the Bi-GRU model performed better than the GRU model. The R² of all fitting results of the Bi-GRU model approached 0.99. The actual sales data from the 28 days following model training were used in a posterior test to further verify the validity and accuracy of the model. The R² of the sales volume forecast for the following 28 days is approximately 0.96. Therefore, it is feasible to use the Bi-GRU model to predict the future sales volume of typical auto parts. In addition, the sales volume of blind drive caps in South China, North China and East China will reach 18235, 17030 and 14949 pieces, respectively, in 90 days. Meanwhile, the corresponding sales volume of bolts will reach 13141, 15062 and 10253 pieces, respectively.
The influence of external variables on the sales volume of auto parts should be considered and discussed in future research, to provide more specific guidance for the formulation of sales strategies and procurement plans.