Statistical analysis of the electric energy production from photovoltaic conversion using mobile and fixed constructions

The paper presents the most representative characteristics, selected from a three-year measurement period, of daily and monthly electricity production from photovoltaic conversion using modules installed in a fixed and a two-axis tracking construction. Results are presented for selected summer, autumn, spring and winter days. The analyzed measuring stand is located on the roof of the Faculty of Electrical Engineering building at Poznan University of Technology. Basic parameters of statistical analysis were used: mean value, standard deviation, skewness, kurtosis, median, range and coefficient of variation. It was found that the asymmetry factor (skewness) can be useful in the analysis of daily electricity production from photovoltaic conversion. To determine the repeatability of monthly electricity production between summer months, and between summer and winter months, the non-parametric Mann-Whitney U test was used. To analyze the repeatability of daily peak hours, i.e. the hours with the largest hourly electricity production, the non-parametric Kruskal-Wallis test was applied as an extension of the Mann-Whitney U test. Based on the analysis of the electric energy distribution recorded by the monitoring system, it was found that traditional forecasting methods for electricity production from photovoltaic conversion, such as multiple regression models, should not be the preferred methods of analysis.

During the design and execution of power supply systems based on renewable sources of energy, various needs and requirements have to be considered. As energy is generated and supplied to the recipients (that is, currents and voltages occur in the elements of the system), the occurrence and impact of electromagnetic fields should be taken into account. This concerns the correct operation of all elements of the system (generation, processing and transport of energy), the impact on the environment (ecological reasons), and the rationalization of energy management. It is important to consider these impacts on various levels: both high-current systems (connected with energy generation and supply) and microstructures (photovoltaic conversion of energy or operation of electronic elements) [16][17][18][19][20][21][22][23][24][25][26][27].
All human activity is associated with optimization behaviours and considerations. People try to achieve their goals in the way most beneficial for them. This leads to a great variety of optimization criteria, methods of constructing criterion functions and solution methods, all of which depend on the given task [4,5,22,[28][29][30][31][32][33][34][35][36]].
The study discusses the optimum energy yield achieved from photovoltaic conversion for PV modules installed in a fixed and a two-axis tracking construction. Due to the strongly stochastic character of energy generation (insolation depends on the season of the year, weather conditions, etc.), the results of the conducted tests were subjected to statistical processing, which constitutes the main part of this article [4,5,[37][38][39][40][41][42][43]].

Parameters of Statistical Analysis
The study presents example courses of the daily variation of electrical energy production by photovoltaic modules in the fixed and two-axis tracking construction for selected days of the summer, autumn, spring and winter months. Based on the analysis of the daily distribution of the registered electrical energy, it is possible to indicate days with an almost symmetrical course of this value in relation to the afternoon hours and days characterized by strong irregularities. The skewness coefficient A_s was used as a measure of asymmetry [43]. The higher the absolute value of the skewness coefficient, the greater the asymmetry in relation to the average value. An asymmetry factor equal to zero indicates symmetry of the variable's distribution, a positive value indicates right-side asymmetry, and a negative value indicates left-side asymmetry.
The coefficient of variation v is the quotient of the variation of a given value around its average (the standard deviation of the population) and the average value itself. It is assumed that if the determined coefficient of variation does not exceed 10%, the feature shows statistically insignificant variation [42]. Large values of the standard deviation in relation to the average value may limit the quality of the forecasting model.
Kurtosis is a measure of the flatness or peakedness of the obtained distribution in relation to the normal distribution. Its value (expressed as excess kurtosis) equals zero for the theoretical normal distribution.
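The parameters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses population (divisor n) central moments, which may differ from the exact estimator used in [43] or in Statistica, and the two hourly profiles are hypothetical values, not measured data.

```python
import math

def describe(values):
    """Return mean, population std, coefficient of variation, skewness, excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (population form, divisor n: an assumption).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    std = math.sqrt(m2)
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,                      # coefficient of variation
        "skewness": m3 / m2 ** 1.5,            # zero for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3.0, # zero for the normal distribution
    }

# Hypothetical hourly energy values in Wh (illustration only).
clear_day = [0, 5, 20, 45, 70, 85, 90, 85, 70, 45, 20, 5, 0]
overcast_day = [0, 1, 2, 3, 2, 4, 3, 60, 5, 2, 1, 0, 0]

for name, day in (("clear", clear_day), ("overcast", overcast_day)):
    stats = describe(day)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

In this sketch the broad, regular profile gives a skewness close to zero, while the profile dominated by a single short peak gives a strongly positive skewness and a much larger coefficient of variation.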
Repeatability of the monthly electrical energy production from photovoltaic conversion was assessed with the non-parametric Mann-Whitney U test. The measure of central tendency in this test is the median. The null and alternative hypotheses may be described as follows:
• H_0: the distribution of average ranks of observations in the analysed groups does not differ significantly; the samples come from one population (the medians of the tested variable in both groups do not differ significantly from one another);
• H_1: the distribution of average ranks of observations in the analysed groups differs significantly; the samples come from different populations (the medians of the tested variable differ from one another).
Depending on the sample size, the value of the test statistic is calculated and, on its basis, the probability value p, which is compared with the significance level α, using the following relations [43]:
§ for small samples (with reference to the distribution tables):
$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 \quad \text{or} \quad U' = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$$
where: n_1, n_2 - respective sizes of the first and second sample; R_1, R_2 - sums of ranks of the elements from the first and second sample;
§ for large samples:
$$Z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12} - \frac{n_1 n_2 \sum (t^3 - t)}{12 (n_1 + n_2)(n_1 + n_2 - 1)}}}$$
where: t - number of cases included in a tied rank.
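The rank-sum statistic and its large-sample normal approximation can be sketched as follows. This is a minimal illustration, not the Statistica implementation: the function names `average_ranks` and `mann_whitney` are hypothetical, the p-value is two-sided and based only on the normal approximation with tie correction (small samples would instead be compared with exact tables), and the example data are invented.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values in ascending order, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney(sample1, sample2):
    """Return (U, Z, two-sided p) using the normal approximation with tie correction."""
    n1, n2 = len(sample1), len(sample2)
    pooled = list(sample1) + list(sample2)
    ranks = average_ranks(pooled)
    r1 = sum(ranks[:n1])                      # sum of ranks of the first sample
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    n = n1 + n2
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    # Variance is zero if all observations are equal; that case is not handled here.
    variance = n1 * n2 * (n + 1) / 12 - n1 * n2 * ties / (12 * n * (n - 1))
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided p-value
    return u, z, p

# Hypothetical daily energy values in kWh for two months (illustration only).
june = [5.1, 6.3, 4.8, 7.0, 6.1, 5.5, 6.8, 4.9]
january = [0.4, 1.2, 0.3, 0.9, 2.1, 0.2, 0.7]
u, z, p = mann_whitney(june, january)
print(f"U = {u:.1f}, Z = {z:.3f}, p = {p:.4f}")
```

Unequal group sizes are handled naturally, which matches the situation of months with different numbers of days.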
The choice of the right hypothesis can be presented with the following relation:
$$p \le \alpha \Rightarrow \text{reject } H_0 \text{ and accept } H_1, \qquad p > \alpha \Rightarrow \text{no grounds to reject } H_0$$
Repeatability of the daily peak hours, in which the highest value of hourly electrical energy production from photovoltaic conversion is observed, was assessed with the non-parametric Kruskal-Wallis test. This test is used to verify the hypothesis that the differences between the medians of the tested variable in several populations (k > 2) are not significant. It assumes that the distributions of the measured value in the compared groups are similar in shape.
The null and alternative hypotheses can be stated as follows:
• H_0: the distribution of average ranks of observations in the analysed k groups does not differ significantly; the samples come from one population (the medians of the tested variable in all groups do not differ significantly from one another);
• H_1: the distribution of average ranks of observations in the analysed k groups differs significantly (the medians of the tested variable differ from one another in individual groups).
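The value of the test statistic H can be determined using the following relation [41]:
$$H = \frac{1}{C} \left( \frac{12}{N (N + 1)} \sum_{j=1}^{k} \frac{\left( \sum_{i=1}^{n_j} R_{i,j} \right)^2}{n_j} - 3 (N + 1) \right), \qquad C = 1 - \frac{\sum (t^3 - t)}{N^3 - N}$$
where: n_j - sizes of the groups for (j = 1,2,…,k); N = n_1 + n_2 + … + n_k - total number of observations; R_{i,j} - ranks assigned to the variable values for (i = 1,2,…,n_j), (j = 1,2,…,k); t - number of cases included in a tied rank; C - correction for tied ranks.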
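The probability value p determined on the basis of the test statistic H should be compared with the assumed significance level α.
The choice of the appropriate hypothesis can be presented with the following relation:
$$p \le \alpha \Rightarrow \text{reject } H_0 \text{ and accept } H_1, \qquad p > \alpha \Rightarrow \text{no grounds to reject } H_0$$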
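A minimal sketch of the Kruskal-Wallis procedure, including the comparison of p with α, is given below. It rests on several assumptions: H is referred to the chi-square distribution with k - 1 degrees of freedom (the usual large-sample approximation), the helper names `average_ranks`, `chi2_sf` and `kruskal_wallis` are hypothetical, and the peak-hour lists are invented values rather than the measured ones.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values in ascending order, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # 1-based mean rank
        i = j + 1
    return ranks

def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution for integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

def kruskal_wallis(groups):
    """Return (H, p) with the tie correction; fails if all observations are equal."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    c = 1 - sum(t ** 3 - t for t in Counter(pooled).values()) / (n ** 3 - n)
    h /= c
    return h, chi2_sf(h, len(groups) - 1)

# Hypothetical daily peak hours for four months (illustration only).
months = [[12, 13, 12, 11, 12, 13], [12, 12, 13, 11, 12, 12],
          [13, 12, 12, 12, 11, 13], [12, 11, 12, 13, 12, 12]]
alpha = 0.05
h, p = kruskal_wallis(months)
decision = "reject H0" if p <= alpha else "no grounds to reject H0"
print(f"H = {h:.3f}, p = {p:.4f}: {decision}")
```

Because peak hours are recorded as whole hours, tied ranks are frequent, so the tie correction matters more here than for continuous energy values.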

Results of Statistical Analysis
Selected results of the statistical analysis of the daily distribution of electrical energy production for the recommended days of the summer, autumn, spring and winter months are presented in Table 1. They include the average value, standard deviation, skewness and coefficient of variation for the photovoltaic module installed in the fixed (us) and two-axis tracking (un) construction. The method of determining the recommended days of the year, calculated for the typical meteorological year of Poznan, was presented in [38]. In the analysed case, the value of A_s for the group of summer days (03.07.2014, 12.07.2014, 26.07.2014) is close to zero, indicating an almost symmetrical distribution. For the winter months, an increase in the asymmetry factor can be observed for the electrical energy produced by the photovoltaic module in both the fixed and the two-axis tracking construction. The highest values of this parameter, exceeding 2, were observed for the winter days (08.01.2015, 15.01.2015, 18.01.2015), which had a strongly random distribution of the surface power density of solar radiation. A symmetrical distribution of daily electrical energy production makes it more predictable, which makes it easier to decide on the method of installing photovoltaic modules in the photovoltaic system and to assess whether the daily energy demand of the powered objects can be covered from morning to evening hours.
In order to determine the repeatability of the courses of monthly electrical energy production from photovoltaic conversion between summer months, and between summer and winter months, the non-parametric Mann-Whitney U test was used to compare two independent groups. The test requires neither a normal distribution of the quantitative variables nor equal group sizes. The latter condition would be difficult to meet in the analysed cases because the months of the year have different numbers of days. The method is also suitable for small samples in which variables are measured on a quantitative, ordinal or dichotomous scale. Observations with equal values in the ordered series were assigned tied ranks. The detailed results of the Mann-Whitney U test obtained in Statistica software for June, July, August, December and January are presented in Table 2.
When summer and winter months are compared (June and January, July and December), the determined test probability p is lower than the assumed significance level α. Hence there are grounds to reject the null hypothesis and to accept the alternative hypothesis, according to which there are significant differences in the monthly distributions of electrical energy production from photovoltaic conversion for the two periods. This is also confirmed by the monthly distributions of electrical energy production for the installed fixed and tracking photovoltaic modules in the indicated time periods. Figure 1 and Figure 2 present the distribution of electrical energy obtained from photovoltaic modules installed in a fixed and a two-axis tracking configuration.
In the case of summer months (June and July, June and August), no grounds to reject the null hypothesis were identified: the monthly distributions of electrical energy do not differ significantly from one another. Also, no statistically significant differences were demonstrated for the two winter months, December 2014 and January 2015.
In order to analyse the repeatability of the daily peak hours, in which the highest value of hourly electrical energy production is observed, the respective 60-minute period of the day was indicated for each day of the analysed months: May, June, July and August 2014. The four independent groups were compared using the non-parametric Kruskal-Wallis test, which is an extension of the Mann-Whitney test. Neither test requires many of the assumptions characteristic of parametric tests. A statistically significant result of the Kruskal-Wallis test would indicate significant differences between the groups in terms of the repeatability of peak hours.
The value p = 0.3279 determined in Statistica software is higher than the assumed significance level α = 0.05. Hence, there are no grounds to reject the null hypothesis of no significant differences in the peak-hour periods for the analysed months of the year; the differences that occur are not statistically significant. The additionally conducted Mann-Whitney U test for two groups confirmed significant differences between the peak hours (average ranks of observations) for the summer month (June 2014) and the winter month (January 2015) at the assumed significance level α = 0.05. The results of the tests are presented in Table 3.
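The preparation of the groups compared above, one list of daily peak hours per month, can be sketched as follows. The data layout (a dictionary keyed by (year, month, day) holding 24 hourly energy values) and the helper names are assumptions made for illustration; the profiles are synthetic, and the actual monitoring system may store its records differently.

```python
def peak_hour(hourly_energy):
    """Return the hour index with the highest energy; ties resolve to the earliest hour."""
    return max(range(len(hourly_energy)), key=lambda h: hourly_energy[h])

def peak_hours_by_month(daily_profiles):
    """Map (year, month) to the list of daily peak hours, in date order."""
    groups = {}
    for (year, month, day) in sorted(daily_profiles):
        profile = daily_profiles[(year, month, day)]
        groups.setdefault((year, month), []).append(peak_hour(profile))
    return groups

def profile_with_peak(hour, peak_value=100.0):
    """Build a synthetic 24-value profile rising to a single peak (illustration only)."""
    return [max(0.0, peak_value - 15.0 * abs(h - hour)) for h in range(24)]

# Hypothetical records keyed by (year, month, day).
daily_profiles = {
    (2014, 5, 1): profile_with_peak(12),
    (2014, 5, 2): profile_with_peak(13),
    (2014, 6, 1): profile_with_peak(12),
    (2014, 6, 2): profile_with_peak(11),
}
groups = peak_hours_by_month(daily_profiles)
for month_key, hours in groups.items():
    print(month_key, hours)
```

Each resulting list forms one independent group, so four months yield the four groups passed to the Kruskal-Wallis test.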

Conclusions
The non-parametric methods used are particularly relevant for small samples. Both the Mann-Whitney U test and the Kruskal-Wallis test demonstrated significant differences between the values of the registered daily electrical energy and instantaneous power from photovoltaic conversion for summer and winter, summer and autumn, and summer and spring months. However, no statistically significant differences in the measured variable were identified between months of the same season of the year at the assumed significance level.
The analysis of variation and repeatability of monthly electrical energy production for almost the whole measurement year was also completed to include determination of, among other things, monthly average value, standard deviation, median, range and coefficient of variation.
Both the quantitative and the qualitative comparative analysis of the monthly electrical energy production from photovoltaic conversion indicate lower variation of the courses for the summer months. The basic parameter in the presented analysis is the coefficient of variation, defined as the ratio of the standard deviation to the arithmetic mean. The coefficient of variation ranges from 0.36 to 0.51 for the months from June to August 2014. This is almost five times lower than the result obtained for December of the same year and over three times lower than that for January of the following year. A coefficient of variation exceeding 0.60 indicates that the standard deviation constitutes a substantial fraction of the average value. Hence, the distribution of the measured value is not homogeneous and, consequently, the arithmetic mean should not be the main statistical measure. Using only the standard deviation in the comparative analysis may turn out to be insufficient due to the considerable differences in the average monthly production of electrical energy between the extreme months of the measurement year.
The determined median, which is considerably higher for the summer months, is the value below and above which half of all the observations in the ordered series are found. The range has comparable values for months from the same period (season of the year). However, the range is not a measure resistant to outliers, and changes in the variation of the measured value between the extreme observations do not change its value.
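The behaviour of the median and the range noted above can be illustrated with a short sketch. The daily energy values are hypothetical, and the coefficient of variation is computed with the population standard deviation, which is an assumption about the estimator.

```python
import statistics

def summarize(values):
    """Return mean, median, range and coefficient of variation (population std / mean)."""
    mean = statistics.fmean(values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "cv": statistics.pstdev(values) / mean,
    }

# Hypothetical daily energy values in kWh (illustration only).
month = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
# One extreme day replaces the largest value.
with_outlier = month[:-1] + [20.0]
# Same extremes as `month`, but the interior values are pushed towards them.
reshuffled = [4.0, 4.1, 4.2, 5.5, 6.8, 6.9, 7.0]

for name, data in (("month", month), ("with_outlier", with_outlier),
                   ("reshuffled", reshuffled)):
    summary = summarize(data)
    print(name, {key: round(value, 3) for key, value in summary.items()})
```

A single extreme day inflates the range while the median stays the same, and pushing the interior values towards the extremes raises the coefficient of variation without changing the range at all.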
The conducted statistical analysis may be useful for evaluating the energy potential of central Poland, represented by the city of Poznan, when planning new investments in the renewable energy sector, mainly those using technologies based on the conversion of solar radiation energy to electrical energy or heat.