Ensembling two deep learning algorithms to efficiently solve the problem of predicting volatility in applied finance

Volatility is one of the most frequently used terms on trading platforms. In financial markets, volatility reflects the magnitude of price fluctuations: high volatility is associated with periods of market turbulence and sharp price swings, while low volatility characterizes calmer pricing. Accurate volatility forecasts are especially important for firms trading options, since option prices are directly tied to a trading firm's profit. This article presents an artificial intelligence model that predicts volatility for future periods of time.


Introduction
All financial companies are interested in an accurate volatility forecast for the nearest future time interval. With such information (a well-founded forecast of short-term volatility), a company can make a unique offer to the market (for example, lowering the cost of buying shares based on the predicted share price) and differentiate itself from competitors [1].
One such global trading company is Optiver, whose specialists have published an open data set containing thousands of rows of detailed financial information, enough to form the generalization ability of an artificial intelligence model that predicts financial market volatility [2].
To solve any applied forecasting problem, an appropriate machine learning algorithm must be chosen, and volatility forecasting is no exception. Problems related to forecasting trends, finance, and time series are successfully solved by deep learning algorithms [3]. The problem under consideration fits the criteria for deep learning well, since financial markets are inextricably linked with forecasting numerical sequences and time series.
Python was chosen as the programming language. It makes it easy to operate with the main methods and basic models of a wide range of deep learning algorithms, which significantly shortens the program code and makes it more convenient to write [4].
The problem formulated in this article can be effectively solved with artificial intelligence algorithms in two ways:
1. Find all relationships in the data, determine the functions that connect them, and evaluate the correlation of seemingly unrelated data; then transfer all the features of the subject area to the artificial intelligence model, transforming the architecture of the algorithm itself if necessary.
2. Leave the search for relationships in the data to the artificial intelligence algorithm.
Consider the first option: to determine all the relationships in the data, one needs not only excellent knowledge of the entire subject area but also an understanding of how individual parameters are related (or not related) to each other. In practice this is infeasible, because even subject-matter specialists are unable to create, let alone transfer into a programmatic artificial intelligence model, a fully connected mathematical model of the domain [5].
In this case, the search for relationships in the data should be left to the artificial intelligence algorithm.
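To illustrate the difference between the two options, the manual approach (option 1) amounts to explicitly inspecting relationships in the data, for example via pairwise correlations. The sketch below uses synthetic order-book-style columns; the column names and values are illustrative assumptions, not the actual Optiver dataset schema.

```python
import numpy as np
import pandas as pd

# Hypothetical order-book snapshot; columns and values are invented
# for illustration and do not reproduce the Optiver data set.
rng = np.random.default_rng(0)
n = 500
bid = 100 + rng.normal(0, 1, n).cumsum() * 0.01   # toy price random walk
df = pd.DataFrame({
    "bid_price": bid,
    "ask_price": bid + rng.uniform(0.01, 0.05, n),  # ask = bid + small spread
    "bid_size": rng.integers(1, 100, n),
    "ask_size": rng.integers(1, 100, n),
})

# Manual approach (option 1): inspect pairwise linear correlations by hand.
corr = df.corr()
print(corr.round(2))
```

Even this toy example shows the limitation: linear correlation captures only one kind of relationship, which is why option 2 (letting the model discover relationships) is preferred here.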

Results and discussion
The problem can be efficiently solved using a Fuzzy Neural Network (FNN). This type of artificial neural network has an important specific feature: an FNN can predict the result based on internal, non-obvious analytics of the source-data parameters [3]. This gives an invaluable advantage: even if analysts do not fully understand how the source data is structured and interrelated, the data can be analyzed by the algorithm, which determines the relationships and constructs a model based on them [4]. However, it is not possible to inspect the model itself and how the relationships inside it are arranged [3], which is why the use of FNN is prohibited in tasks related to ensuring people's life, health, and safety [6]. Since the subject area of the problem considered in this article is not related to the safety of life and health, the use of the FNN algorithm is permissible.
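As a rough illustration of how an FNN maps inputs to a prediction through fuzzy sets rather than an explicit closed-form model, the following minimal NumPy sketch implements an ANFIS-style forward pass. The centers, widths, and per-rule outputs are invented for illustration and are not taken from the article's model.

```python
import numpy as np

def gaussian_membership(x, centers, widths):
    """Degree to which each input belongs to each fuzzy set."""
    return np.exp(-((x[:, None] - centers) ** 2) / (2 * widths ** 2))

# Toy fuzzy layer: one scalar feature, three fuzzy sets ("low", "mid",
# "high"). All numbers below are illustrative assumptions.
x = np.array([0.1, 0.5, 0.9])           # three sample inputs
centers = np.array([0.0, 0.5, 1.0])     # fuzzy set centers
widths = np.array([0.2, 0.2, 0.2])      # fuzzy set spreads

mu = gaussian_membership(x, centers, widths)   # rule firing strengths
mu /= mu.sum(axis=1, keepdims=True)            # normalize per sample
rule_outputs = np.array([0.05, 0.10, 0.30])    # per-rule volatility guess
prediction = mu @ rule_outputs                 # weighted defuzzification
print(prediction)
```

The point of the sketch is the opacity noted above: the prediction emerges from overlapping membership degrees, not from a relationship an analyst wrote down explicitly.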
Given the fuzzy neural network's orientation toward independent data analytics, note that an FNN is, first of all, an artificial intelligence model that, during learning, also "makes" assumptions about the best way to form each stage of the decision algorithm's architecture [3]. In most cases a single stage is formed at random, since deeper analytical exploration of the data by the algorithm would require enormous computing power and serious time resources [4].
To make stage formation analytical rather than random, it is not necessary to complicate the FNN with additional internal mathematical calculations. A rational solution is to introduce an auxiliary LGBM (Light Gradient Boosting Machine) algorithm into the FNN pipeline. This approach increases the capabilities of the FNN while not requiring large time and computational resources [2]. The auxiliary LGBM algorithm allows each new stage of the architecture to be selected from those proposed by the FNN based on the minimum loss-function value and the best model accuracy, rather than by assumption. Figure 1 shows a fragment of the algorithm implemented in Python.
After the auxiliary module for the fuzzy neural network has been implemented, the FNN algorithm itself is applied (Figure 2). Once the algorithm is implemented, the metrics used to evaluate its accuracy must be defined [4]. The key factors are the validity of the metric and its reliability. For the FNN algorithm, both factors are satisfied by the "inverse" validation loss metric [3]: the difference between unity and the validation loss gives the most complete assessment of the accuracy of the artificial intelligence model.
The model's accuracy is evaluated by the validation loss metric (Figure 3), and cross-validation is used to select the best training epoch (at which the highest generalization ability is achieved) [3]. The green block in Figure 3 outlines the values of the validation loss metric at each training epoch. An epoch is one pass in which the model attempts to improve the internal structure of the algorithm on the training data [4].
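Selecting the best epoch from per-epoch validation losses can be sketched as follows. The loss values below are made up for illustration, except that the final minimum matches the article's reported 0.21466.

```python
import numpy as np

# Hypothetical per-epoch validation losses; only the minimum (0.21466)
# is taken from the article, the rest are illustrative.
val_losses = np.array([0.61, 0.44, 0.35, 0.29, 0.25, 0.22, 0.21466,
                       0.219, 0.224, 0.231])

best_epoch = int(np.argmin(val_losses))   # epoch with best generalization
best_loss = float(val_losses[best_epoch])
accuracy_estimate = 1.0 - best_loss       # the article's "inverse" metric
print(best_epoch, best_loss, accuracy_estimate)
```

Note that the loss curve rises again after the minimum: training past the best epoch would only overfit, which is why the best-epoch value, not the final-epoch value, is reported.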
The blue block in Figure 3 shows the state of the optimal algorithm configuration, characterized by the final value of the validation loss metric [3]. Thus, the most valid estimate of model accuracy under the validation loss metric is the difference between unity and 0.21466, that is, 0.78534, which is a good indicator for trend-forecasting algorithms [6].
Ensembling the FNN algorithm with the auxiliary LGBM module made it possible to obtain genuinely high model accuracy (Figure 3). But without a direct comparison of the two variants (FNN with and without LGBM), it would be unfounded to argue that LGBM is necessary. Figure 4 shows the accuracy of the FNN model obtained without the auxiliary LGBM algorithm.

Conclusion
From the accuracy reports of the machine learning algorithms (Figures 3-4), we can conclude that the difference in accuracy is significant, amounting to almost 0.2 (0.78534 vs. 0.59088). This gives grounds to argue that ensembling several types of algorithms really yields a gain in the overall performance of the applied artificial intelligence model for predicting volatility.
However, it should be noted that this approach will not yield a similar performance gain in every case: the basic design of a model is often sufficient to achieve the best approximation ability, and additional add-ons can even worsen the initial accuracy indicators [2,4].