Short-term solar irradiation forecasting using deep neural networks with decomposition methods, optimized by a grid search algorithm

Because solar energy is variable, accurate forecasts are needed to manage bilateral contract negotiation between suppliers and customers. To meet this need, this paper proposes an ensemble approach to forecast solar irradiation. The signal processing techniques Variational Mode Decomposition (VMD) and Discrete Wavelet Transform (DWT) are used with deep neural networks to forecast solar irradiation. The hyperparameters of the deep learning models are optimized using grid search within a suitable, tolerable search range. Three years of data (2012-14) for the New Delhi location are used: the data for 2012-2013 are used to train the models, and testing is done on the data for 2014. Among all developed models, Bi-LSTM-VMD-Grid Search performs best in terms of RMSE (5.456 W/m2), MAPE (0.948%) and R2 (0.924), because Bi-LSTM processes the information twice, and faster than the other algorithms, and VMD refines the quality of the input data better than DWT. The results of the proposed model are compared with existing techniques for predicting solar irradiation, and the forecasts are found to be more efficient and reliable.


Introduction
Since the concept of Artificial Intelligence (AI) was introduced, much has changed in the solar irradiance field, and researchers have developed various algorithms that apply AI to real-life problems. AI is a broad term and a superset of Machine Learning (ML). The aim of AI is to mimic human behaviour, such as intelligence, problem-solving skills and actions, whereas ML implements numerical solutions to problems based on feature selection. For example, weather forecasting uses a machine learning model to analyse data in almost real time. Deep learning (DL) is a subset of ML that uses Artificial Neural Networks (ANN) to obtain outputs. In this research, deep neural network algorithms are used to forecast solar radiation.

Deep Learning
Deep learning deals with algorithms that are based on the structure and function of the human brain. It works in a way similar to the brain, through learning and refinement. Because of its high accuracy and precision on large datasets compared with conventional ML, it is well suited to forecasting solar radiation, which generally requires a large dataset to predict future values accurately and precisely. LSTM was used in [1] to predict solar radiation; the authors demonstrated its application and the scope for modifying the model. Bi-LSTM was used in [2], whose authors concluded that the Bi-LSTM algorithm gives comparatively better outputs. LSTM was also used in [3] to predict solar radiation, and the authors concluded that the algorithm performs well.

Literature Survey and Research Gap
Although the usefulness of DL in predicting solar radiation is recognized, there is still ongoing debate regarding the optimal models and methodologies. The unpredictability of a model's output depends mainly on the volatile nature of solar radiation, which in turn depends on conditions such as ambient air quality, temperature, cloudiness, latitude and seasonality. To address these issues, a literature review was conducted and gaps in current research were identified. The study in [4] compared composite algorithms with standalone algorithms for solar radiation forecasting and indicated that the standalone approach may be inaccurate, requiring further model improvements and innovations in hybridization procedures. To improve forecasting accuracy, the data are first pre-processed using signal processing methods and then fed to the DL network. In [5], researchers used a NARX-based hybrid model for forecasting and demonstrated that hybrid models predict with lower errors. To obtain optimal results, researchers have also used an LSTM-based hybrid algorithm [6]. Building on this literature, academics began combining signal processing approaches with DL, with promising results. The aim and major contributions of this manuscript are as follows. The paper integrates the decomposition and transformation techniques VMD and DWT with deep learning algorithms to predict solar radiation.
• Identifying and selecting significant time-lag data from a 2012 to 2014 dataset for the New Delhi location, to evaluate the factors that are prominent in solar radiation prediction.
• Decomposing the solar radiation dataset using the DWT and VMD signal pre-processing techniques.
• Obtaining and analysing the main components of the decomposed values and extracting suitable information.
• Applying a deep learning approach to the processed dataset to present its accuracy and impact.

Reference | Year | Algorithm/Methodology | Output parameter
[7] | 2017 | LOLIBEE, MLP Algorithm | 95% R2 value
[8] | 2020 | CNN-LSTM | 3.044% average improvement
[9] | 2022 | EEMD-GA-LSTM | 29.22% average improvement
[10] | 2022 | Bi-LSTM, CEEMDAN, GA | 28.66% average improvement
[11] | 2023 | Monte Carlo Simulation | 40% increment in solar power generation
[12] | 2023 | Boost Converter with high simulation device | 95% increment in experimental result
[13] | 2023 | Two axis solar tracking system | Incremental solution with best result
[14] | 2023 | Smart solar bench with optimum angle | Optimum result with best tilt angle

Pre-processing techniques used

Discrete Wavelet Transform
The DWT uses a succession of high-pass and low-pass filters to divide a signal into several frequency bands with different resolutions. It is frequently used in image and audio processing for noise reduction, edge detection and feature extraction, and in data compression methods to reduce the size of large datasets. The DWT operates on discrete-time signals sampled at regular intervals [15]. The DWT can be represented mathematically by Equation (1).
Where x1 is the scaling coefficient, x2 is the translation coefficient, Ψ is the wavelet function, Xin is the input and t is the total number of values.
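As a minimal sketch of the filter-bank idea described above, the following Python code performs one level of the DWT and its inverse. The Haar wavelet and the function names haar_dwt and haar_idwt are assumptions made for illustration only; the wavelet family used in this work is not stated here.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums capture the slow trend.
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    # High-pass branch: scaled pairwise differences capture fast fluctuations.
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original samples."""
    scale = 1.0 / math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * scale)
        signal.append((a - d) * scale)
    return signal
```

Applying haar_dwt again to the approximation coefficients would give the next, coarser decomposition level, which is how the successive frequency bands are obtained.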

Variational Mode Decomposition
VMD is a signal processing method that breaks solar radiation data down into a set of periodic modes. In solar forecasting, it enhances accuracy and interpretability by minimising a cost function that balances the smoothness and discretization of the decomposed modes [16]. In VMD, the input signal is first passed through the Hilbert transform, and the Fourier transform is then used to carry out harmonic mixing. The bandwidth of each mode is obtained from the H1 Gaussian smoothness, that is, the L2 norm of the gradient, and the resulting problem is written in Equation (2). Equation (2) gives the constrained problem; the unconstrained form is given in Equation (3).
Where εk is the Dirac function, Xk is the input frequency, cfa is the centre frequency and ma denotes the modes, indexed from 0 to (T-1).
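For orientation, the constrained and unconstrained VMD problems are commonly written in the VMD literature as shown below. This is the standard textbook form, given here as an assumption about what Equations (2) and (3) express. The notation is also an assumption and differs from the symbols listed above: f is the input signal, u_k the k-th mode, ω_k its centre frequency, δ the Dirac function, α a penalty weight and λ the Lagrange multiplier.

```latex
% Constrained problem: minimise the total bandwidth of the K modes
\min_{\{u_k\},\{\omega_k\}} \;
  \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)

% Unconstrained problem: augmented Lagrangian of the problem above
\mathcal{L}\left(\{u_k\},\{\omega_k\},\lambda\right) =
  \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2
  + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
  + \left\langle \lambda(t),\; f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle
```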

Gated Recurrent Units
GRUs use gating mechanisms to selectively update and reset the network's hidden state, enabling the network to remember or forget information as required. In contrast to LSTMs, GRUs have just one "hidden state" unit, represented in Equation (4), and control information flow with an "update gate", represented in Equation (5), and a "reset gate", represented in Equation (6). GRUs have been shown to be successful in natural language processing tasks while being computationally less expensive than LSTMs, which makes them appealing for real-time applications [17].
Where h_t is the hidden state at the current timestamp, h_(t-1) is the hidden state at the previous timestamp, z_t is the update gate at the current timestamp and tanh is the hyperbolic tangent function.
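The gating described above can be sketched as a single GRU step on scalar values. This is an illustration under stated assumptions, not the implementation used in this work: the parameter names (wz, uz, bz and so on) are hypothetical, scalars stand in for the weight matrices, and the blending convention shown (update gate weighting the candidate state) is one of two equivalent conventions found in the literature.

```python
import math


def sigmoid(v):
    """Logistic function; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One GRU time step for scalar input x and scalar previous state h_prev.

    p is a dict of hypothetical scalar parameters:
    wz, uz, bz (update gate), wr, ur, br (reset gate), wh, uh, bh (candidate).
    """
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    # Candidate state: the reset gate scales how much of the past is used.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # New hidden state: blend of the previous state and the candidate.
    return (1.0 - z) * h_prev + z * h_cand
```

With all parameters set to zero, both gates evaluate to 0.5 and the candidate to 0, so the step simply halves the previous hidden state; this makes the role of the update gate easy to check by hand.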

Long short-term memory
LSTM networks are made up of memory cells linked together by gates that regulate the flow of information. They are capable of capturing long-term dependencies in data, which is critical for generating accurate predictions from past data [18]. The model analyses this data over time and predicts future amounts of solar radiation. The LSTM network has three inputs: the current input SI(s), the previous memory cell output H(s-1) and the bias EF. The activation values can be written as in Equation (10) and Equation (11). Whether information is discarded or kept in the LSTM is determined by a set of equations, such as the forget gate given in Equation (8). The memory cell output is represented in Equation (12) and the hidden state in Equation (13).
Where the four EF terms represent the biases of the LSTM, the four W terms are the weight factors of the LSTM network, and the value of the sigmoid function lies in the range [0, 1].
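For reference, the standard LSTM cell equations are given below in the notation of this section, with SI(s) the input, H(s-1) the previous output and EF the bias. The gate subscripts (f, i, c, o), the cell-state symbol C and the ordering of the equations are assumptions made for illustration and may not match the numbering of Equations (8) to (13) exactly.

```latex
\begin{aligned}
f(s) &= \sigma\!\left( W_f \, [H(s-1),\, SI(s)] + EF_f \right) && \text{forget gate} \\
i(s) &= \sigma\!\left( W_i \, [H(s-1),\, SI(s)] + EF_i \right) && \text{input gate} \\
\tilde{C}(s) &= \tanh\!\left( W_c \, [H(s-1),\, SI(s)] + EF_c \right) && \text{candidate cell state} \\
C(s) &= f(s) \odot C(s-1) + i(s) \odot \tilde{C}(s) && \text{memory cell} \\
o(s) &= \sigma\!\left( W_o \, [H(s-1),\, SI(s)] + EF_o \right) && \text{output gate} \\
H(s) &= o(s) \odot \tanh\!\left( C(s) \right) && \text{hidden state}
\end{aligned}
```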

Bi-directional Long Short-Term Memory
Bi-LSTM is superior to LSTM in solar radiation prediction. It processes the sequence twice, operating in the forward and backward directions simultaneously, which allows it to capture more complex temporal connections. It can model the time series of solar radiation data together with other relevant factors to discover their correlations and produce accurate predictions. However, the model's success depends on the quality and quantity of the training data, as well as on the specific characteristics of the problem. Overall, Bi-LSTM is a preferable choice for solar radiation prediction because of its capacity to capture complicated patterns in data [19,20].
The Bi-LSTM network is updated through the forward hidden layer (Hf), represented in Equation (14), the backward hidden layer (Hb), represented in Equation (15), and the output sequence SIo(s), represented in Equation (16).
Where Hf, Hb and SIo(s) represent the forward parameter, the backward parameter and the output sequence, and W denotes the weight factor.
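The forward and backward passes described above can be sketched as follows. To keep the example short, a plain tanh recurrent unit is swapped in for the full LSTM cell, and the weights w_in, w_rec, w_f and w_b are hypothetical scalars; only the bidirectional mechanics are illustrated.

```python
import math


def run_direction(seq, w_in, w_rec):
    """Run a simple tanh recurrent unit over seq and return every hidden state."""
    h, states = 0.0, []
    for x in seq:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional(seq, w_in=0.5, w_rec=0.5, w_f=1.0, w_b=1.0):
    """Combine a forward pass and a backward pass into one output per time step."""
    hf = run_direction(seq, w_in, w_rec)              # forward hidden states
    hb = run_direction(seq[::-1], w_in, w_rec)[::-1]  # backward states, re-aligned in time
    return [w_f * f + w_b * b for f, b in zip(hf, hb)]
```

Because the backward state at step s has already seen every later sample, the output at the first step changes when a later input changes, which a purely forward network cannot do.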

Flow Graph and Model Evaluation Metrics
To accomplish the objectives of the research, the research area and associated datasets are chosen first, the chosen models are then implemented, and finally their performance is measured using evaluation metrics. Because predictions using deep learning are data-oriented, and the data can contain multiple attributes and data points, the data must be pre-processed to identify the attributes that have the greatest impact on the prediction and to discard those with a negligible effect. The data is also segregated at this stage, since false values and noise may be present and must be handled accordingly. Figure 1 gives the general model description.

Data Description
In this study, a dataset for New Delhi is used to evaluate the proposed model. According to the Köppen climate classification system, New Delhi has mixed climate characteristics of 'Cwa' and 'BSh'. These mixed characteristics allow the model to be evaluated under different weather conditions. Three years of hourly data (2012-14) for the New Delhi (capital of India) location were collected from the National Solar Radiation Database (NSRDB) for training, validation and testing [21]. Many academics have used NSRDB data in their research because of its advantages: 1) free and easy access, 2) extensive temporal and spatial coverage, and 3) no missing values in the data. NSRDB provides satellite-based data acquired using a satellite-to-irradiance model created by the State University of New York. The data collected from NSRDB contain hourly Global Horizontal Irradiance (GHI) values and several other meteorological variables. Two years of data are used for training and one year of data is used for testing the developed model on a seasonal basis: winter (December to January), spring (February to March), summer (April to June), monsoon (July to mid-September) and autumn (end of September to November). Table 2 gives the geographical coordinates, climatic conditions and clear-sky hours of the selected location.
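The train/test split and the seasonal grouping described above can be sketched as a pair of small helper functions. The function names are hypothetical, and the monsoon/autumn boundary is assumed to fall after 15 September, since the text only says "mid-September".

```python
from datetime import date


def season_of(d):
    """Map a calendar date to one of the five seasons used in this study."""
    m = d.month
    if m in (12, 1):
        return "winter"
    if m in (2, 3):
        return "spring"
    if 4 <= m <= 6:
        return "summer"
    # Assumption: "mid-September" is taken as 15 September inclusive.
    if m in (7, 8) or (m == 9 and d.day <= 15):
        return "monsoon"
    return "autumn"


def split_of(d):
    """Years 2012 and 2013 are used for training, 2014 for testing."""
    return "train" if d.year in (2012, 2013) else "test"
```

For example, season_of(date(2014, 9, 16)) falls in autumn under the assumed boundary, and split_of(date(2014, 9, 16)) places that sample in the test set.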

Normalization
To enhance the accuracy and speed of the prediction model, and to mitigate the effect of the differing scales in the solar irradiance data, the data is normalized using the min-max normalization method in this paper. This involves scaling the data relative to its minimum and maximum values [22,23]. Equation (17) gives normalized values in the range [0,1].
Where X_norm is the normalized value, X is the original value, X_max is the maximum value present in the data and X_min is the minimum value present in the data.
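A minimal sketch of min-max normalization and its inverse is given below; the function names are hypothetical. The inverse is included because forecasts produced on the normalized scale have to be mapped back to W/m2 before errors such as RMSE are reported.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; also return the min and max used for scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Map normalized values back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]
```

A common practice, assumed here rather than stated in the text, is to compute the minimum and maximum on the training years only and reuse them for the test year, so that no information from the test data leaks into training.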

Evaluation Metrics
Evaluation metrics are used to determine the performance of the model, for example whether the model is accurate and precise, or how long it takes to make a prediction. In this research, MAPE, RMSE and R2 are used. MAPE is the mean absolute percentage error, RMSE is the root mean square error and R2 is the coefficient of determination. The RMSE, described in Equation (18), is a performance indicator used to assess the reliability of a machine learning model's predictions. It is the square root of the mean squared difference between predicted and actual values, with a smaller number signifying better performance [24,25]. The MAPE, represented in Equation (19), is a performance statistic that calculates the percentage difference between real and predicted values, which is valuable when the scale is important. The lower the MAPE value, the higher the performance. R2, described in Equation (20), measures the proportion of variance in the target variable that is explained by the independent variables in a regression model. It typically ranges from 0 to 1, with higher values suggesting a better match. It is an effective indicator for assessing the performance of regression models.
Where y_i is the real value, ŷ_i is the predicted value and t is the total number of values.
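The three metrics can be sketched in plain Python using their standard definitions. The function names are hypothetical, and one design choice is an assumption that is not taken from the text: samples whose actual value is zero (for example night-time GHI) are skipped in the MAPE to avoid division by zero.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, in the units of the data (W/m2 for GHI)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    # Assumption: zero actual values are excluded to avoid division by zero.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

For actual values [100, 200, 300] and predictions [110, 190, 300], the errors are 10, -10 and 0, so the RMSE is the square root of 200/3, the MAPE is 5% and R2 is 0.99.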

Proposed Work Implementation
The models were established on the basis of the identified research gap, and the main objective of this work is to implement them on the dataset. Two signal processing techniques, VMD and DWT, and three deep learning methods, Bi-LSTM, LSTM and GRU, are used.
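Since the models in this work forecast one hour ahead from hourly time series input, the series first has to be framed as input/target pairs before any of the networks can be trained. The sketch below shows one common way to do this with a sliding window; the function name and the number of lags are hypothetical and are not taken from this work.

```python
def make_windows(series, n_lags):
    """Build (inputs, targets) pairs for one-step-ahead forecasting.

    Each input holds n_lags consecutive samples; the target is the next sample.
    """
    inputs, targets = [], []
    for i in range(n_lags, len(series)):
        inputs.append(series[i - n_lags:i])
        targets.append(series[i])
    return inputs, targets
```

With hourly data, each target is the value one hour after the last sample in its window, so a series of N samples yields N - n_lags training pairs.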

Hyper-parameter Selection Using Grid Search Algorithm
The literature offers no fixed rule for selecting hyper-parameters. They can be selected by varying the parameter values within a particular range. Table 3 shows the range considered for each parameter, and Figure 4 shows the graph used to select the optimum value. The rules for selecting the hyper-parameters follow [10,26].
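The grid search procedure can be sketched as an exhaustive loop over every combination of candidate values. Everything specific in the example is hypothetical: the parameter names, the candidate values (the actual ranges are those of Table 3) and the toy_error function, which merely stands in for training a network and returning its validation error.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and keep the one with the lowest score."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)  # e.g. validation RMSE of a trained model
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search ranges, for illustration only.
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden_units": [32, 64, 128],
    "epochs": [50, 100],
}


def toy_error(p):
    """Stand-in objective; a real run would train and validate a network here."""
    return (abs(p["learning_rate"] - 0.01)
            + abs(p["hidden_units"] - 64) / 100
            + abs(p["epochs"] - 100) / 1000)


best, score = grid_search(grid, toy_error)
```

The cost grows with the product of the number of candidates per parameter (3 x 3 x 2 = 18 evaluations in this example), which is why the search range has to be kept within a tolerable size.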

Result
This paper uses three years of data (2012-14) for the New Delhi location to evaluate the performance of the deep learning networks. DWT and VMD pre-processing are used to divide the collected data into multiple fundamental functions. The performance of the developed models is measured using RMSE (W/m2), MAPE (%) and R2. Tables 4, 5 and 6 show the results of the developed models for 1-hour-ahead forecasting using time series input data. As per the Delhi Tourism website, the three-year data is split into five seasons: winter, spring, summer, monsoon and autumn [14(21)]. Model development and compilation are performed using MATLAB R2020a on a system with an Intel i3 processor and 16 GB of RAM.
The acquired input data was examined to determine how the attributes were related to the solar radiation value. In this process, the DWT and VMD decomposition methods were used to extract useful information from the input by analysing it and separating it into different modes. Both methods were used in particular for feature selection and for removing noise or interference. The DL algorithms then received the clean, filtered data produced by the decomposition methods. Figure 5 shows that the Bi-LSTM model with VMD decomposition achieves the maximum R2 (0.924) and the minimum RMSE (5.456 W/m2) and MAPE (0.948%) in comparison with the other combinations. The results show that the developed model performs best in the autumn season and worst in the winter season, owing to the large fluctuation between measured and forecasted values.
For a more thorough evaluation of the results, Figure 6 shows a statistical representation of the actual and forecasted GHI for the autumn and monsoon seasons. These two seasons are used to compare the outcomes of the season with the lowest RMSE (autumn) and the season with the highest RMSE (monsoon). For better understanding, only the actual and forecasted GHI curves of the proposed model for the selected seasons are given.
Figure 6 shows that significant changes in the real GHI lead to greater inaccuracy in the results, while the smooth curve of the autumn season reflects clear atmospheric conditions that the model can identify with ease. In the monsoon season, significant changes in the real GHI lead to maximum inaccuracy, as rain makes the pattern hard to identify. Figure 6 thus implies that when the variation of the actual GHI is significant, the resemblance between the actual and forecasted GHI is low, and when the variation is small, the resemblance is higher. However, the proposed model also handles several of the irregularities in the actual GHI within a tolerable range of error. As a result of these findings, the proposed model serves as a good forecasting model for both stable and unstable seasons.

Comparison of Present Work with Previous Published Work
[Table: comparison of the present work with previously published work; first column: author and year of publication]

[Figure: actual GHI and forecasted GHI plotted against data points]

Conclusion
The primary objective of this investigation is to forecast solar irradiation accurately. In the literature, various authors have used various machine learning techniques to forecast solar irradiation, but because of some shortcomings of these techniques, deep learning networks are used in this study. Three years of data (2012-14) are used to forecast solar irradiation one hour ahead. The distinguishing feature of this research is the use of two different pre-processing techniques, DWT and VMD, to refine the quality of the input data. The grid search algorithm optimizes the hyperparameters of the deep neural networks. The performance of the developed models is estimated using MAPE, RMSE and R2. Among all developed models, Bi-LSTM-VMD is better in all respects. The developed model performs best in autumn, when there are fewer irregularities between predicted and measured GHI, and worst in winter, when the correlation between forecasted and measured values is lower. The proposed Bi-LSTM/VMD approach is favourable because of its low errors [RMSE (5.456 W/m2), MAPE (0.948%)]. Since Bi-LSTM operates both forward and backward, it is particularly suited to time-series prediction problems in which past data points have a strong impact on future values. For this reason the model gives better results, and it may also perform well at other locations. The most challenging task in this research is finding the optimum learning value of the deep learning model. In future, researchers may therefore use automatic tuning techniques to select the parameters, and they may use sky images or other input data to forecast solar irradiation.