Rainfall Forecast of Merauke Using Autoregressive Integrated Moving Average Model

Climate is an important element of human life, notably for the agriculture sector. Global climate change increases the frequency and intensity of extreme climatic events such as storms, floods, and droughts. Rainfall is the climate factor that causes harvest failure in Merauke. Therefore, rainfall forecast information is very useful for anticipating extreme events that can lead to crop failure. The purpose of this research is to model rainfall using the autoregressive integrated moving average (ARIMA) model. The ARIMA model can be used to predict future events from a set of past data, including rainfall. The research used secondary data from the Agency of Meteorology, Climatology, and Geophysics (BMKG) covering 2005 to 2017, analyzed with R 3.4.2 software. The analysis showed that ARIMA (2,0,2) is the appropriate model for predicting rainfall in Merauke. The forecast from ARIMA (2,0,2) for one period ahead gives an average rainfall of 179 mm, a minimum of 46 mm, and a maximum of 295 mm. It can be concluded that the intensity of rainfall in Merauke has decreased and that there was a seasonal shift from the previous period.


Introduction
Climate is an important element of human life in several fields, such as fishery and agriculture. Climate change is a natural phenomenon in which the values of climate elements change, either through natural processes or through processes accelerated by human activities. Nowadays, climate change is a shared problem among communities, agencies, and states, and at the global level, and it demands a serious solution because it affects many aspects of life. One of the climate change phenomena is the increased frequency and intensity of extreme climate events such as storms, floods, and droughts [1]. Carbon emissions will hence lead to adverse climate change on both short and long time scales.
Global warming has resulted in climate change, which strongly influences agriculture because this sector depends heavily on climate conditions [2]. Climate change adversely affects water availability and tends to decrease harvest quality [3], [4]. Additionally, agriculture is one of the important sectors in the provision of food. One of the main food crop commodities in Indonesia is rice. Papua province is one of the areas with a high diversity of biological resources, especially Merauke City, which is the largest rice producer in Papua and has the potential to become one of the national rice production centers [5]. One of the determinants of plant cultivation is rainfall [6].
Rainfall is the amount of rainwater that falls on an area in a certain time. Reduced rain intensity is one of the biggest reasons for the decrease in farmer yields [7]. High rainfall can cause floods, while low rainfall can cause drought; both can lead to crop failure. This is shown in Fitriani's research [5], which states that crop failure in Merauke is due to rainfall. Therefore, rainfall forecast information is very useful for anticipating extreme events that may lead to crop failure.
The ARIMA model is one of the time series analysis models that can be used to predict future events from a set of past data, including rainfall over a period of time [8][9].
The advantage of the ARIMA model is that it can be used to analyze random, seasonal, and even cyclic behavior in the time series. Therefore, the ARIMA model is used here to predict rainfall one period ahead. Research on rainfall forecasting using the ARIMA model was conducted by Weesakul and Lowanichchai [10], who forecast rainfall in Thailand for rice cultivation water supply.

Data Plot
Before identifying the model, the first step is to plot the data to judge visually whether the series is stationary. A more convincing check is to plot the autocorrelation values: if they drop to zero quickly after a certain lag, the data can be said to be stationary, whereas if the autocorrelation decays to zero slowly or remains significantly different from zero, the data is not stationary. If the data is not stationary, it must be modified to obtain stationary data. The commonly used methods are differencing for non-stationarity in the mean and transformation for non-stationarity in the variance.
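The two modifications mentioned above can be sketched in a few lines. This is a minimal illustration in Python rather than the R workflow used in this study; the function names, the sample values, and the offset of 1 added before taking logarithms (so that months with 0 mm remain defined) are assumptions made for the example.

```python
import math

def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag] (stabilizes the mean)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def log_transform(series, offset=1.0):
    """Return log(x + offset) for each value (stabilizes the variance).

    The offset is an assumption of this sketch; it keeps zero-rainfall
    months defined.
    """
    return [math.log(x + offset) for x in series]

# Hypothetical monthly totals in mm, not the Merauke data
rain = [120.0, 150.0, 90.0, 0.0, 30.0]
diffed = difference(rain)    # one value shorter than the input
logged = log_transform(rain)
```

Each round of differencing shortens the series by `lag` observations, which matters when the sample is small.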

Correlogram
A correlogram is a technique for identifying the stationarity of a time series using the autocorrelation function (ACF), obtained by plotting ρk against k (the lag). The plot of ρk against k is called the population correlogram. In practice, only the sample autocorrelation function can be computed. For stationary data, the correlogram decreases rapidly as k increases. For non-stationary data, the correlogram decays slowly and tends not to reach zero.
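As an illustration of the sample autocorrelation function, the sketch below computes r_k for lags 0 to a chosen maximum using the usual estimator (lag-k sum of products of deviations divided by the lag-0 sum). The function name and the input values are assumptions for the example, not part of the original analysis.

```python
def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k))
        acf.append(ck / c0)
    return acf

# Hypothetical short series, for illustration only
r = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```

A common rule of thumb treats values outside roughly ±1.96/√n as significantly different from zero, which is how the bands on a correlogram are usually drawn.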

Unit Root Test
The unit root test used is the Augmented Dickey-Fuller (ADF) test. The hypotheses are: H0: the data has a unit root (non-stationary); H1: the data is stationary. H0 is rejected if p-value < 0.05, in which case the data can be said to be stationary.
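For reference, the ADF test is usually stated through the regression below. The symbols α, γ, δ_i, and the number of lagged differences m are notation introduced here for illustration and do not appear in the original text; a version with a constant and no trend term is assumed.

```latex
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{m} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad
H_0:\ \gamma = 0 \ \text{(unit root)} \quad \text{vs.} \quad H_1:\ \gamma < 0 \ \text{(stationary)}
```

The test statistic is the t-ratio of the estimate of γ, compared against Dickey-Fuller critical values rather than the usual t distribution.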

Model Identification
If the data is stationary in mean and variance, the analysis can proceed to the ACF and PACF plots. Based on the ACF and PACF, some possibly suitable models can be identified. Model identification is conducted to assess the autocorrelation and stationarity of the data, so that it can be decided whether transformation or differencing should be performed.

Parameter Estimation
After candidate models are determined from the ACF and PACF plots, the model parameters are estimated. One of the methods used to estimate ARIMA model parameters is ordinary least squares (OLS) [11]. A parameter significance test is conducted to determine whether the model is suitable for prediction. The test used is the t-test [10].

Residual Assumption Test (diagnostic checking)
Diagnostic checking is conducted to identify whether the estimated model is suitable for the data. It is based on residual analysis. The basic assumption of the ARIMA model is that the residuals are independent random variables, normally distributed with zero mean and constant variance.

Independence Test
Whether the residuals are independent can be checked using the Ljung-Box test [11]. This test uses the sample autocorrelations of the residuals. The hypotheses are: H0: ρk = 0; H1: ρk ≠ 0. H0 is rejected if p-value < 0.05; if H0 is not rejected, the residuals can be said to have no autocorrelation.
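The Ljung-Box statistic itself is simple to compute from the residual autocorrelations: Q = n(n + 2) Σ r_k² / (n − k), summed over lags k = 1..h. The sketch below is a minimal illustration; the function name and residual values are assumptions, and the p-value step is left out because it needs a chi-square distribution function that the Python standard library does not provide.

```python
def ljung_box_q(residuals, max_lag):
    """Return the Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

# Hypothetical residuals, for illustration only
q = ljung_box_q([0.5, -0.2, 0.1, -0.4, 0.3, -0.1], max_lag=2)
```

A large Q relative to the chi-square reference distribution is evidence of remaining autocorrelation; for residuals of a fitted ARMA(p, q) model the degrees of freedom are usually taken as h − p − q.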

Normality Test
The normality assumption on the residuals is tested using the Kolmogorov-Smirnov test [12]. The hypotheses are: H0: the errors are normally distributed; H1: the errors are not normally distributed. H0 is rejected if the test statistic exceeds the critical value.
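The Kolmogorov-Smirnov statistic is the largest gap between the empirical distribution of the residuals and the hypothesized normal distribution. The sketch below assumes the mean and standard deviation of the reference normal are given; the function names and sample values are illustrative assumptions.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, mu, sigma):
    """Return D = sup |F_n(x) - F(x)| against a normal reference."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # Compare the reference CDF with the empirical step just after and just before x
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical residuals, for illustration only
d_stat = ks_statistic([-1.2, -0.3, 0.1, 0.4, 1.5], mu=0.0, sigma=1.0)
```

For large n the 5% critical value is approximately 1.36/√n. When the mean and standard deviation are estimated from the same residuals, this critical value is too lenient and a corrected table (the Lilliefors variant) is normally used.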

Selection of the Best Model
The best model is selected using Akaike's Information Criterion (AIC). This criterion is based on maximum likelihood estimation (MLE). The best model is the one with the minimum AIC value [13].
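In its standard form, AIC = −2 ln L + 2k, where L is the maximized likelihood and k the number of estimated parameters. The sketch below applies this to two placeholder models; the labels, log-likelihood values, and parameter counts are hypothetical and are not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical (log-likelihood, number of parameters) pairs
candidates = {"A": (-900.0, 4), "B": (-890.0, 6)}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # model with the smallest AIC
```

The 2k term penalizes additional parameters, so a larger model is preferred only when its likelihood gain outweighs the penalty.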

Forecasting
The most important purpose of time series analysis is to predict future values [14]. Once the best model has been selected, it is ready to be used for forecasting. The forecasting method is expected to increase confidence in the values predicted for subsequent periods.
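To make the forecasting step concrete, the sketch below iterates an ARMA(p, q) recursion forward, replacing future errors by their expected value of zero. It assumes the sign convention in which moving average terms are added and the series is modeled around a mean mu; the function name and all coefficient values in the example are hypothetical, not the fitted estimates of this study.

```python
def arma_forecast(history, residuals, mu, phi, theta, steps):
    """Forecast `steps` values ahead from an ARMA(p, q) model with mean mu.

    history:   observed values, most recent last (at least p values)
    residuals: in-sample one-step errors, most recent last (at least q values)
    phi:       autoregressive coefficients [phi_1, ..., phi_p]
    theta:     moving average coefficients [theta_1, ..., theta_q]
    """
    z = [y - mu for y in history]  # work with deviations from the mean
    e = list(residuals)
    forecasts = []
    for _ in range(steps):
        ar_part = sum(phi[i] * z[-1 - i] for i in range(len(phi)))
        ma_part = sum(theta[j] * e[-1 - j] for j in range(len(theta)))
        z_next = ar_part + ma_part
        z.append(z_next)
        e.append(0.0)  # the expected value of a future error is zero
        forecasts.append(mu + z_next)
    return forecasts

# Hypothetical inputs, for illustration only
fc = arma_forecast([10.0, 12.0], [1.0, -1.0], mu=10.0,
                   phi=[0.5, 0.2], theta=[0.3, 0.1], steps=2)
```

Because future errors are set to zero, the moving average part vanishes after q steps and long-horizon forecasts of a stationary model drift back toward the mean.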

Data Collection
The data used is monthly rainfall data from 2005 to 2017 obtained from the Meteorology, Climatology, and Geophysics Agency (BMKG) in Merauke. The data from 2005 to 2016 is used for modeling, while the data of 2017 is compared with the forecasting result. The data was obtained by requesting it from the responsible officer, presenting a letter of permission to conduct research.
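The split described above, 12 years for modeling and the final year held out, can be sketched as follows. The text does not state which error measure was used for the comparison, so the root mean squared error shown here is an assumption, as are the function names and the placeholder series.

```python
import math

def split_holdout(series, holdout=12):
    """Split a series into a modeling part and the final `holdout` values."""
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Placeholder values standing in for 13 years x 12 months; not the BMKG data
monthly = [float(i % 12) for i in range(156)]
train, test = split_holdout(monthly, holdout=12)
```

With 156 monthly values, the modeling part contains 144 observations and the hold-out part contains 12.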

Plot of rainfall data
The first step in time series modeling is to plot the data to see whether it is stationary in mean and variance. If the data is not stationary in the mean, differencing is required, and if it is not stationary in the variance, a transformation is required. The plot of rainfall from 2005 to 2016 is shown in Fig 1. Based on Fig 1, it can be seen that there is a shift in rainfall and that the intensity of rainfall decreases. Rainfall reaches its maximum in April 2006, at 663 mm, and its minimum in October 2015, when there was no rain at all (0 mm). The average rainfall from 2005 to 2016 is 172 mm with a standard deviation of 171 mm. Visual observation of Fig 1 suggests that the rainfall data is stationary in mean and variance. To confirm this, a stationarity test using the Augmented Dickey-Fuller (ADF) test is conducted.

Rainfall Data Stationarity Test
Visually, the rainfall data appears stationary in mean and variance. To confirm this, a stationarity test is conducted using the Augmented Dickey-Fuller (ADF) test [15]. The hypotheses are: H0: the rainfall data is non-stationary; H1: the rainfall data is stationary. H0 is rejected if p-value < α. Using R 3.4.2, the ADF test gives p-value = 0.01 < α = 0.05, so H0 is rejected. It can therefore be concluded that the rainfall data is stationary.

Identification of ACF and PACF
Once the data is stationary, it can be used to identify the order of the ARIMA (p,d,q) model. Identification is based on the ACF and PACF plots of the rainfall data in Merauke, shown in Fig 2. Based on these plots, the significant values occur at small lags, so orders 1, 2, and 3 are proposed for the moving average (MA) component and orders 1, 2, and 3 for the autoregressive (AR) component.

Prediction of ARIMA Model
The candidate models are determined from the ACF and PACF plots, giving 9 ARIMA models formed by combining the possible orders p (1, 2, 3) and q (1, 2, 3). The combinations of ARIMA models are shown in Table 1.
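The 9 candidates are simply every pairing of the proposed orders, which can be enumerated as below. Setting d = 0 is an inference from the stationarity result and from the selected ARIMA (2,0,2) model, since the contents of Table 1 are not reproduced here; the variable names are illustrative.

```python
from itertools import product

p_orders = (1, 2, 3)  # proposed autoregressive orders
q_orders = (1, 2, 3)  # proposed moving average orders
d = 0                 # assumed: the series was judged stationary, so no differencing

candidates = [(p, d, q) for p, q in product(p_orders, q_orders)]
```

Each tuple is an (p, d, q) order that would be fitted in turn and compared on AIC.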

Parameter Estimation of ARIMA Model
After identification, there are 9 candidate models, as listed in Table 1. The next step is to estimate the parameters of each model. The estimation results obtained using R 3.4.2 are listed in Table 2.
Table 2 shows that ARIMA (2,0,2) is the best of the candidate models based on the AIC criterion, since it has the smallest AIC value. However, it is still necessary to know whether its parameters are significant, so a parameter significance test is conducted. The test used for the parameters of ARIMA (2,0,2) is the t-test. The results of the significance test are listed in Table 3. With n = 144, the null hypothesis is rejected. Therefore, it can be concluded that the parameters of ARIMA (2,0,2) are significant.

Residual Assumption Test (Diagnostic Checking) of ARIMA Model
After parameter estimation and the significance test, a white noise test is carried out for ARIMA (2,0,2). The aim of this test is to check that the residuals of the fitted model have no autocorrelation. The result of the diagnostic test of the ARIMA model using R 3.4.2 is shown in Fig 3.

Forecasting
Because all steps of ARIMA modeling have been completed and the requirements of the ARIMA model have been met, the next step is to forecast rainfall with the model for one period ahead. Table 4 shows the result of the rainfall forecast using ARIMA (2,0,2). Predicted rainfall is divided into three categories determined by BMKG: 'low' (0 mm to 100 mm), 'medium' (100 mm to 300 mm), and 'high' (more than 300 mm). Based on Table 4, the forecast falls in the 'low' category in July, August, and September 2018 and in the 'medium' category in January to June 2018 and October to December 2018; no month falls in the 'high' category. Moreover, based on the forecast, the maximum intensity of rainfall decreases and there is a seasonal shift: the maximum rainfall in 2017 occurs in April, whereas the forecast maximum in 2018 occurs in February. Meanwhile, based on visual observation, the plots of the original rainfall data and the forecast data have the same pattern. The time series plot of the rainfall forecast in Merauke is shown in Fig 4. The result shows that rainfall in Merauke in 2018 under ARIMA (2,0,2) reaches its maximum in February (295 mm) and its minimum in August (46 mm), while the average rainfall is 179 mm. This rainfall forecast can be used by farmers to choose the best rice planting time within the medium category. During this time, farmers do not need much additional water from the river, which reduces the use of water pumping machines. This in turn reduces the use of fuel oil and thereby carbon emissions in Merauke.
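The BMKG categories quoted above can be applied to a forecast value with a small helper. The stated ranges overlap at exactly 100 mm, so assigning that boundary to 'low' is an assumption of this sketch, as is the function name; the three values checked are the minimum, average, and maximum reported for the 2018 forecast.

```python
def rainfall_category(mm):
    """Classify a monthly rainfall total (mm) into the BMKG categories."""
    if mm < 0:
        raise ValueError("rainfall cannot be negative")
    if mm <= 100:   # assumption: exactly 100 mm counts as 'low'
        return "low"
    if mm <= 300:   # 'high' is stated as more than 300 mm
        return "medium"
    return "high"

# Minimum, average, and maximum reported for the 2018 forecast
labels = [rainfall_category(v) for v in (46, 179, 295)]
```

None of the three reported values reaches the 'high' category, in line with the description of Table 4.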

Conclusion
Based on the research results, the best model for rainfall forecasting in Merauke according to the AIC is ARIMA (2,0,2). Under this model, the maximum rainfall in 2018 decreases and there is a seasonal shift from the previous period, while the average rainfall and the minimum rainfall increase.
This model can be used to determine the best planting time, reducing the use of fuel oil for water pumps, which has a positive impact on reducing carbon emissions in Merauke.
The authors thank Musamus University for facilities and support. This project and publication were supported by DIPA Musamus University.

Table 4. Result of Rainfall Prediction in 2018