An automated method for detecting sporadic effects in cosmic rays

The paper proposes an automated method for analyzing neutron monitor data and detecting sporadic effects in the dynamics of cosmic rays. The method is based on LVQ neural networks and wavelet transform constructions. It is shown that the method detects sporadic effects of different amplitudes and durations and allows their parameters to be evaluated. Procedures for detecting sporadic effects and assessing their intensity are implemented numerically. The choice of algorithm parameters is investigated and ways of optimizing them are proposed. Using the events of April 13-14, 2013 and March 8-9, 2014 as examples, the effectiveness of the method for detecting sporadic effects in cosmic rays that precede and accompany magnetic storms is shown.


Introduction
The study of cosmic rays (CR) is of interest for research into astrophysical processes and for many practical problems, including space weather monitoring and forecasting and the radiation safety of cosmonauts [1]. The intensity of secondary cosmic rays depends on air temperature and pressure, the geomagnetic coordinates of the observation site and the direction of arrival of the flux, the state of the geomagnetic field and the electromagnetic environment in the Solar System, and the physical conditions in the Galaxy [2,3]. At present, the problem of operational and mathematically accurate space weather prediction remains unsolved [1]. Predicting space weather on-line requires automated methods for analysing recorded cosmic ray data and for the timely detection of sporadic effects. Sporadic effects include Forbush effects [4] and large proton increases, known as Ground Level Enhancements (GLE events). The Forbush effect comprises a decrease in the intensity of cosmic rays, its subsequent recovery, and minor changes in CR dynamics before the onset of magnetic storms. Ground level enhancements (GLE) of solar CR are the largest proton increases; they pose a serious radiation hazard and a danger to human health and life.
At present, data from the world network of neutron monitors are used to study the dynamics of cosmic rays [5]. The secondary cosmic rays recorded in this way have a complex nonlinear structure that contains, in addition to useful information, noise of various kinds [3]. Incomplete knowledge of the processes in near-Earth space and of their interaction significantly complicates the construction of models and methods for analysing neutron monitor data. The classical methods in use [6-8] make it possible to single out stable characteristics of the data, but they are not effective enough for studying non-stationary changes in cosmic ray variations. Modern methods [9] determine the main characteristics of the dynamics of the cosmic ray flux with acceptable accuracy, but they require complex computations and are therefore not automated. It is also very important to detect sporadic effects of low amplitude, which can occur on the eve of magnetic storms and serve as their predictors. Detecting small Forbush effects is a complex task that requires a highly qualified expert and an extensive network of observations [10]. Effective detection of low-amplitude Forbush effects has not yet been achieved. Because information about the CR data structure [10] and about the processes occurring in near-Earth space is lacking, it is proposed to use learning vector quantization (LVQ) neural networks [11,12] and the wavelet transform [13] for the analysis of neutron monitor data. The advantages of neural networks are their ability to solve problems with unknown regularities and dependencies, to identify and filter out parameters that are not informative for the analysis, and to adapt to the changing nature of the signal, as well as their high speed and suitability for automation [14,15], which is very important for the operational analysis of space weather [16-18].
The wavelet transform is widely used to study data of a complex structure; it makes it possible to extract informative components and suppress noise [19]. In this work, a neural network is used to assess the state of the cosmic ray flux from neutron monitor data. To improve the efficiency of the neural networks, a data preprocessing procedure based on orthogonal multi-scale analysis is proposed. To detect Forbush effects of different amplitudes and durations and to evaluate their intensity, the continuous wavelet transform and threshold functions are used. The approach was first proposed in [20]; a detailed description of the method and an estimation of its effectiveness are presented in [21]. This paper explores the choice of algorithm parameters for the implementation of the method and proposes ways of optimizing them.

Pre-processing of neutron monitor data based on orthogonal multi-scale analysis
Neutron monitor data contain noise caused by equipment errors and by natural interference [3,4]. To suppress the noise, the data were preprocessed using orthogonal multi-scale analysis (MSA) [13,14]. Applying MSA up to the decomposition level $m$ represents the time series as a sum of orthogonal components of different scales:
$$f_0(t) = \sum_{j=-1}^{-m} g[2^{j} t] + f[2^{-m} t], \qquad (1)$$
where $g[2^{j} t]$ are the detailing (high-frequency) components and $f[2^{-m} t]$ is the smoothed component.
The choice of the decomposition level is discussed below, in the experimental part of the work (see Section 3).
To restore the initial resolution of the function, the wavelet reconstruction operation is performed: $f[2^{-m} t] \rightarrow f_0^{(-m)}(t)$, where the superscript $(-m)$ corresponds to the resolution of the function before the wavelet reconstruction operation.
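The preprocessing step can be illustrated with a minimal sketch: the series is decomposed to level m, the detailing components are discarded, and the smoothed component is reconstructed at the initial resolution. The sketch assumes a periodized Daubechies order-2 transform written in plain Python; the function names (`dwt_step`, `idwt_step`, `msa_smooth`) and the periodic boundary handling are illustrative assumptions, not taken from the paper.

```python
import math

# Daubechies order-2 (4-tap) scaling filter H and the matching wavelet filter G.
_S3 = math.sqrt(3.0)
H = [c / (4.0 * math.sqrt(2.0)) for c in (1 + _S3, 3 + _S3, 3 - _S3, 1 - _S3)]
G = [H[3], -H[2], H[1], -H[0]]


def dwt_step(x):
    """One analysis step with periodic extension: (approximation, detail)."""
    n = len(x)
    approx, detail = [], []
    for k in range(n // 2):
        approx.append(sum(H[i] * x[(2 * k + i) % n] for i in range(4)))
        detail.append(sum(G[i] * x[(2 * k + i) % n] for i in range(4)))
    return approx, detail


def idwt_step(approx, detail):
    """Inverse of dwt_step (transpose of the orthogonal analysis matrix)."""
    n = 2 * len(approx)
    x = [0.0] * n
    for k in range(len(approx)):
        for i in range(4):
            x[(2 * k + i) % n] += H[i] * approx[k] + G[i] * detail[k]
    return x


def msa_smooth(signal, m):
    """Smoothed component of level m, restored to the initial resolution.

    Assumption: the series length is a multiple of 2**m, so every level
    has an even number of samples.
    """
    if m < 0 or len(signal) % (2 ** m) != 0:
        raise ValueError("series length must be a multiple of 2**m")
    approx = list(signal)
    for _ in range(m):
        approx, _unused_detail = dwt_step(approx)   # drop detail components
    for _ in range(m):
        approx = idwt_step(approx, [0.0] * len(approx))
    return approx
```

With `m = 0` the series is returned unchanged, which corresponds to processing without MSA; larger `m` removes progressively coarser detail components.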

Estimation of the state of the cosmic ray flux based on the LVQ neural network
The LVQ network consists of two layers: a Kohonen (competitive) layer and a linear layer that determines the correspondence between the clusters $k$ and the appropriate classes $l$ [11,12]:
$$y^{(2)}_l = \sum_{k} w_{kl}\, y_k,$$
where $w_{kl}$ are the weight coefficients of neuron $l$ of the second layer of the network associated with neuron $k$ of the first layer, and $y_k$ is the output value of neuron $k$ of the first layer. Input vectors are clustered based on the operation
$$d_k = \|X - W_k\| = \sqrt{\sum_{i=1}^{I} (x_i - w_{ki})^2},$$
where $X$ is the input vector, $W_k$ is the weight vector of neuron $k$ of the first (competitive) layer, and $I$ is the dimension of the input vector. During the operation of the network, the winning neuron $p$ is determined, for which $d_p = \min_k d_k$. The output vector of the network has a dimension equal to the number of classes L (in this work L = 3; the classes are described below). One element of the output vector is equal to one, and the rest are equal to zero. Thus, the network assigns the input vector to one of the a priori known classes. According to the task and following [21], the following classes were defined for the neural network:
1. "Calm" class: absence of sporadic effects in CR. The data were selected taking into account: (1) absence of active spots and flares on the Sun (zero flare activity); (2) absence of solar wind flux from the visible side on the line with the Earth; (3) absence of magnetic storms and disturbances in the magnetosphere.
2. "Weakly disturbed" class: presence of sporadic effects of low amplitude. The data were selected taking into account: (1) insignificant solar flare occurrences directed to the Earth; (2) presence of weak disturbances in the magnetosphere.
3. "Disturbed" class: presence of sporadic effects of high amplitude. The data were selected taking into account: (1) penetration of disturbed high-velocity fluxes of solar wind and/or a shock wave associated with it to the Earth vicinity; (2) magnetic storm and strong disturbances in the magnetosphere.
To improve the efficiency of the neural network, the input vectors of the network were preprocessed on the basis of MSA (see Section 2.1). The general scheme of the solution is illustrated in Fig. 1.
Fig. 1. Scheme for problem solution.
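The two-layer decision described in this section can be sketched as follows: the competitive layer picks the weight vector closest to the input in Euclidean distance, and the linear layer maps the winning cluster to a one-hot class vector. The prototype values, the cluster-to-class table, and the function names are hypothetical. The paper does not describe the training procedure, so the update shown is the standard LVQ1 rule, included here only as an assumption.

```python
import math


def lvq_classify(x, prototypes, cluster_to_class, n_classes=3):
    """Return (index of the winning neuron p, one-hot class vector)."""
    # Competitive (Kohonen) layer: Euclidean distance to each weight vector.
    distances = [
        math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))
        for w in prototypes
    ]
    p = min(range(len(prototypes)), key=distances.__getitem__)
    # First-layer output: 1 for the winner, 0 for all other neurons.
    y = [1.0 if k == p else 0.0 for k in range(len(prototypes))]
    # Linear layer: 0/1 weights w_kl route each cluster k to its class l.
    out = [
        sum(y[k] * (1.0 if cluster_to_class[k] == l else 0.0)
            for k in range(len(prototypes)))
        for l in range(n_classes)
    ]
    return p, out


def lvq1_update(x, label, prototypes, cluster_to_class, lr=0.1):
    """Standard LVQ1 step (assumed): pull the winner toward x when its
    class matches the label, push it away otherwise. Modifies prototypes."""
    p, _ = lvq_classify(x, prototypes, cluster_to_class)
    sign = 1.0 if cluster_to_class[p] == label else -1.0
    prototypes[p] = [w + sign * lr * (xi - w) for xi, w in zip(x, prototypes[p])]
    return p
```

In this sketch class indices 0, 1, 2 would stand for the "Calm", "Weakly disturbed" and "Disturbed" classes, and several first-layer neurons may map to the same class.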

Detection and estimation of the intensity of sporadic effects based on continuous wavelet transform
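A continuous wavelet transform was applied to detect sporadic effects [13,14]:
$$(W_\Psi f)(b,a) = |a|^{-1/2} \int_{-\infty}^{\infty} f(t)\, \Psi\left(\frac{t-b}{a}\right) dt, \qquad (2)$$
where $\Psi$ is the wavelet, $a$ is the scale, and $b$ is the time shift. Sporadic effects were then detected with the threshold function (3), which compares the deviation of each coefficient $W_\Psi f_{b,a}$ from $W^{med}_{\Psi f_{b,a}}$, the median value calculated in a moving time window of length $\Phi$, with the threshold $T_a = U \cdot St_a$, where $U$ is the threshold coefficient and $St_a$ is the characteristic level of the variations at scale $a$. The intensity of sporadic effects $E_b$ (4) is positive in the case of an anomalous increase in the neutron monitor data and negative in the case of an anomalous decrease. The block scheme of the algorithm that implements operations (2)-(4) is presented in the Application.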
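The thresholding and intensity steps can be sketched under explicit assumptions, since the exact form of operations (3) and (4) is not reproduced here. The sketch assumes that the wavelet coefficients are already computed, that the moving window covers the samples preceding the current one, that the characteristic level is the sample standard deviation in that window, that the retained value is the deviation from the moving median, and that the intensity is the sum of retained values over scales. The function name `detect_sporadic` is hypothetical.

```python
import statistics


def detect_sporadic(coeffs, window, U=2.5):
    """coeffs[a][b]: wavelet coefficient at scale index a and time index b.

    Returns (P, E): P[a][b] holds the retained deviations, E[b] the
    intensity estimate. Positive E marks an anomalous increase,
    negative E an anomalous decrease.
    """
    if window < 2:
        raise ValueError("window must contain at least two samples")
    n_times = len(coeffs[0])
    P = [[0.0] * n_times for _ in coeffs]
    for a, row in enumerate(coeffs):
        for b in range(window, n_times):
            past = row[b - window:b]            # samples preceding time b
            med = statistics.median(past)       # moving median
            spread = statistics.stdev(past)     # assumed characteristic level
            dev = row[b] - med
            if abs(dev) > U * spread:           # threshold T_a = U * spread
                P[a][b] = dev
    # Intensity: sum of the retained deviations over all scales (assumed).
    E = [sum(P[a][b] for a in range(len(coeffs))) for b in range(n_times)]
    return P, E
```

Lowering `U` makes the detector respond to smaller deviations, which is the trade-off examined for U = 1.5, 2 and 2.5 in the experimental part.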

Experimental results and discussion
We used minute data from neutron monitors at Thul and Inuvik stations [5,22]. To perform orthogonal multi-scale analysis (see operation (1)), Daubechies wavelets of order 2 were used. The rationale for the choice of the wavelet and the algorithm for determining the best approximating wavelet for the decomposition level are presented in [21]. In this paper, to optimize the method and the algorithms that implement it, we consider the choice of the decomposition level at the data preprocessing stage (see Section 2.1) and the determination of the threshold coefficient U used in the detection of sporadic effects (see Section 2.3, operation (3)). Because there is no a priori information about the structure of the useful signal, the a posteriori risk [23] was used to determine these parameters:
$$R(h^i_j) = \sum_{l} \Pi_{jl}\, P\{x \in X_l \mid h^i_j\}, \qquad (5)$$
where $\Pi_{jl}$ is the loss function, $P\{x \in X_l \mid h^i_j\}$ is the conditional probability, and the states $h^i_j$ characterize the presence/absence of a sporadic effect of class $i$. Averaging the conditional risk function over all states $h^i_j$ gives the average risk:
$$R = \sum_{i} p_i\, R(h^i_j),$$
where $p_i$ is the prior probability of the state $h^i_j$. Given the lack of knowledge about the prior distribution of states $p_i$, a simple loss function was used to estimate the posterior risk:
$$\Pi_{jl} = \begin{cases} 0, & j = l, \\ 1, & j \neq l. \end{cases}$$
The estimates of the posterior risk showed that the risk (5) can be minimized by the choice of the decomposition level when performing MSA (operation (1)). The smallest error of the Algorithm for detecting sporadic effects (see Section 2.3 and the Application) is provided by the threshold coefficient U = 2.5.
Figures 2-4 show the application of the Algorithm for detecting sporadic effects to the period from March 7 to 11, 2014. According to the space weather forecast [24], the analyzed period is characterized as calm: the solar wind speed (SWS) was at the level of 400-500 km/s from March 7 and gradually decreased to 250 km/s by March 10 (Fig.
2 (a)), the flare activity was moderate, the southern component of the Interplanetary Magnetic Field (IMF) did not exceed Bz = -5 nT (Fig. 2 (b)), and the geomagnetic field, according to the data of mid-latitude and high-latitude stations, was very quiet throughout the entire period (the Ap index did not exceed 5 and the Kp index did not exceed 2 [25]). The flare was accompanied by a burst of radio emission at a wavelength of λ = 10.7 cm with an intensity of F = 110 f.u. The data processing results show an increase in the CR intensity level on March 7 at about 13:00 UT at Inuvik station (Fig. 2 (g), (h)) and at about 11:00 UT at Thul station (Fig. 2 (j), (k)), which is consistent with a sharp increase in SWS (Fig. 3 (a)) and an increase in the K-index (K-index = 2) (Fig. 2 (d)). According to the estimates of the intensity of sporadic effects (operation (4)), a decrease in the intensity of the cosmic ray flux occurred on March 8 at 22:00 UT at Inuvik station (Fig. 2 (g), (h)) and at about 09:00 UT at Thul station (Fig. 2 (j), (k)). Then, on March 9, at 12:00 UT at Inuvik station and at 10:00 UT at Thul station, the flux intensity began to increase smoothly; the anomalous change lasted about 8 hours. Note that, according to the IZMIRAN Space Weather Forecast Center [26], a Forbush effect of small amplitude was recorded on March 9, 2014, beginning at 18:00. The results obtained confirm the effectiveness of the proposed algorithm and show that it can be applied to isolate sporadic effects of small amplitude.
Examples of performing operation (1) for different decomposition levels are shown in Figure 3. Figures 3 (b), (c) show the results of the Algorithm for m = 0 (without the use of MSA). Note that anomalous changes in the signal (increases/decreases) have blurry spectral images (see Fig. 3 (b), and the intensity curve is smeared and does not allow the moments of occurrence of sporadic effects to be recorded accurately (Fig.
3 (c)), which is especially pronounced on 09.03.14 from 00:00 UT to 06:00 UT and from 13:00 UT to 23:00 UT). The results of applying the MSA operation for the decomposition level m = 1 (see Fig. 3 (e)) show its effectiveness: the Forbush effects are detected (from 08.03.14 22:00 UT to 09.03.14 23:00 UT at Inuvik station and from 08.03.14 09:00 UT to 09.03.14 17:00 UT at Thul station) with the value of Eb increased by a factor of 1.2. This result indicates a decrease in the noise level in the signal and an increase in the efficiency of the algorithm due to the use of MSA. The use of MSA for the decomposition levels m = 2, 3 (Fig. 3 (g) and Fig. 3 (i)) also enhances the detection capabilities of the algorithm: the magnitude increases, and the spectral images and the intensity curve become clearer. Further application of MSA (decomposition level m = 4, Fig. 3 (j), (k)) does not improve the quality of the detection operation, which indicates that the main noise in the signal has been suppressed and that further filtering can lead to the loss of significant information.
Figure 4 shows the results of the algorithm for different threshold coefficients U (see operation (3)). Figures 4 (g), (h) and 4 (e), (f) show the results of the algorithm with the threshold coefficient U = 1.5 and U = 2, respectively. The threshold coefficient U = 1.5 detects variations in the data that exceed the characteristic level by a factor of 1.5 or more in amplitude, and the threshold coefficient U = 2 detects variations that exceed it by a factor of 2 or more. The analysis shows that with the threshold coefficient U = 1.5 (Fig. 4 (g), (h)), fluctuations in the intensity of cosmic rays associated with the diurnal variation are detected (signal/noise = 0.47). The use of the threshold coefficient U = 2 (Fig.
4 (e), (f)) allows better detection of anomalous changes in the cosmic ray data (signal/noise = 0.54), but daily variations are still present. The best results are shown in Figure 4 (c), (d): signal/noise = 0.59 and no daily variations in the processing results. Thus, this example confirms that the threshold coefficient U = 2.5 is chosen correctly, including for the detection of Forbush effects of small amplitude.
Fig. 4. [...] e) operation (3) for U=2, f) operation (4) for U=2, g) operation (3) for U=1.5, h) operation (4) for U=1.5.
Fig. 5 illustrates the results of neutron monitor data processing at Inuvik station during a period of increased solar activity (from April 8 to April 22, 2013). According to space weather data [32], the SWS at the beginning of the period under consideration was 350 km/s, and the IMF southern component was Bz = +4 nT (Fig. 5 d). Due to the accelerated flow from a coronal hole, the SWS increased to 500 km/s on April 11 and decreased to 400 km/s on April 13. The accelerated flux from the coronal mass ejection reached near-Earth space at the end of April 13; the SWS rose abruptly to 560 km/s, and the IMF Bz component decreased to -7 nT (Fig. 5 d). The results of the Algorithm for detecting sporadic effects show the onset of an anomalous increase in the CR intensity at the beginning of April 13 (Fig. 5 b, c); the anomaly lasted about 12 hours. A few hours before the event, the CR level began to increase, and then a Forbush effect of large amplitude appeared (Fig. 5 c). In line with the results of studies [21,27,28], this example shows a characteristic change in CR dynamics both on the eve of and during a magnetic storm. The results of the operation of neural networks (Fig.
5 g, h) confirm anomalous changes of small amplitude ("Weakly disturbed" class, see Section 2.2) in the dynamics of CR on April 13; during the event, the state of the CR flux became disturbed ("Disturbed" class), which characterizes the emergence of a large sporadic effect. Comparison of the results of the neural network without MSA (Fig. 5 g) with the results of the network implemented according to the scheme in Figure 1 (Fig. 5 h) confirms the efficiency of the MSA operation at the data preprocessing stage. These anomalies are reflected both in the results of the threshold algorithm (operations (3), (4), Fig. 5 b, c) and in the results of NN application (Fig. 5 g, h). On April 16, the solar wind parameters returned to unperturbed values. Until the end of April 23, the SWS was 350 km/s and the IMF southern component was Bz = +4 nT (Fig. 5 d). The neural networks also classified the end of the period as "Calm" (Fig. 5 g, h).
The results of the method based on LVQ neural networks and wavelet transform constructions have shown its effectiveness in detecting sporadic effects in cosmic rays. The possibility of detecting sporadic effects of small amplitude, which can serve as predictors of deep, large-scale Forbush effects and strong magnetic storms, has been confirmed experimentally using the events of March 8-9, 2014 and April 13, 2013. The software implementation of the method allows it to be used on-line, which is important for space weather forecasting. The high detection capability of the Sporadic Effects Detection Algorithm makes it possible to analyse the dynamics of cosmic rays in detail using data from different stations. To optimize the method, a procedure for choosing the level of the orthogonal multi-scale decomposition at the data preprocessing stage and for determining the threshold coefficient used in the detection of sporadic effects is proposed and substantiated.
In the future, the authors plan to continue research in this direction using a wider range of cosmic ray recording stations and a larger body of statistical data.
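The parameter selection by risk minimization used in the experimental part can be illustrated with a short sketch. It assumes a simple (0-1) loss, equal prior probabilities when none are supplied, and a labeled set of intervals with known states; the function names and the toy decision rule are hypothetical, and the numbers in the usage example are synthetic, not results from the paper.

```python
def empirical_risk(true_states, decisions, priors=None):
    """Average risk under the simple (0-1) loss.

    The conditional risk of a state is the share of samples in that state
    that were assigned to any other state; the average risk weights the
    conditional risks by the prior probabilities.
    """
    states = sorted(set(true_states))
    if priors is None:
        # Assumption: equal priors when the prior distribution is unknown.
        priors = {s: 1.0 / len(states) for s in states}
    risk = 0.0
    for s in states:
        idx = [i for i, t in enumerate(true_states) if t == s]
        errors = sum(1 for i in idx if decisions[i] != s)
        risk += priors[s] * errors / len(idx)
    return risk


def select_parameter(candidates, decide, samples, true_states):
    """Return (best candidate, risks) where best minimizes the risk."""
    risks = {
        c: empirical_risk(true_states, [decide(x, c) for x in samples])
        for c in candidates
    }
    best = min(risks, key=risks.get)
    return best, risks


# Synthetic usage example: state 1 = sporadic effect present, 0 = absent.
amplitudes = [0.5, 1.2, 1.8, 2.2, 3.0, 4.0]
labels = [0, 0, 0, 0, 1, 1]
best_U, risks = select_parameter(
    [1.5, 2.0, 2.5],
    lambda x, U: 1 if abs(x) > U else 0,   # toy decision rule
    amplitudes,
    labels,
)
```

The same routine could be applied to the decomposition level m by letting `decide` run the full processing chain for each candidate level.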

Application
Fig. 6 shows the algorithm for detecting sporadic effects in cosmic rays and estimating their intensity. The first operation, shown as an enlarged block and representing the continuous wavelet transform, is not described in detail, to avoid overloading the block scheme and repeating material from the paper. Some operations are written in the Matlab programming language.
Fig. 6. Block-scheme of the algorithm for estimating the intensity of sporadic effects in CR.