Method for constructing accuracy estimates of measuring equipment based on the Bayesian scientific approach

Before new unique samples of technical systems are put into commercial operation, and before new technologies are introduced into production, a range of tests is usually carried out. A small or very small volume of statistical test data is characteristic of unique and small-scale products and technical systems. Constructing effective statistical estimates from a limited amount of statistical information is therefore an important practical problem. The article develops the Bayesian approach to constructing point and interval estimates of the parameters of known distribution laws. The joint use of a priori and a posteriori information when processing a limited volume of statistical data can significantly increase the reliability of the result. As an example, we consider the two most typical distribution laws that arise when testing new unique samples of measuring devices and equipment: the normal distribution with an unknown mean and a known variance, and the normal distribution with an unknown mean and an unknown variance. It is shown that in these cases the parameters of the distribution laws are themselves random variables that obey the normal law and the gamma-normal law, respectively. Recalculation formulas are obtained that refine the parameters of these laws by taking a posteriori information into account. If these formulas are applied several times in succession, the system undergoes self-learning, or self-tuning. The proposed approach can thus find application in the development of intelligent self-learning and self-tuning systems.


Introduction
The Bayesian scientific approach is widely used to construct effective statistical estimates in various fields of activity [1][2][3][4][5][6][7][8][9][10][11][12]. Radio engineering, classification theory, machine learning, and the creation of self-learning and self-tuning systems are just some of the areas where the Bayesian approach is applied effectively. This paper describes the application of the Bayesian approach to constructing effective statistical estimates of the accuracy of new and innovative measurement devices and instruments [13][14][15]. Algorithms for constructing the distribution density function of the mean value and the distribution function of the standard deviation are described. The developed algorithm takes into account the available statistical data together with a priori information about the process or object under investigation. The results of the paper are used to determine the accuracy class of measuring instruments from the results of acceptance tests, state tests, or tests intended to confirm the type of a measuring device or instrument [13][14][15].

The logical scheme of the Bayesian scientific approach
Suppose $\Theta = (\theta_1, \theta_2, \ldots, \theta_s)^T$ is a random vector parameter involved in the description of the distribution law, where $s$ is the dimension of $\Theta$. It is required to construct the best, in a certain sense, statistical estimate $\hat{\Theta}$ of this vector of parameters from the available k-dimensional observations $X_1, X_2, \ldots, X_n$ (2). Here and hereinafter, the superscript $T$ means the operation of transposing a vector. Uppercase letters denote vector quantities, lowercase letters denote the one-dimensional (possible or observed) values of the random variables being analyzed, and matrices (vectors whose components are also vectors) are denoted separately. A priori information is a probability distribution function of the analyzed unknown parameter. It is assumed that this information was obtained before the statistical data were collected. As new statistics become available, the distribution function is refined. Under certain assumptions, the transition from the a priori distribution to the a posteriori distribution is made using the Bayes formula [1]:
$p(\Theta \mid X_1, \ldots, X_n) = \dfrac{L(X_1, \ldots, X_n \mid \Theta)\, p(\Theta)}{\int L(X_1, \ldots, X_n \mid \Theta)\, p(\Theta)\, d\Theta}$, (1)
where the likelihood function $L(X_1, \ldots, X_n \mid \Theta)$ is determined from observations (2) as in the Maximum Likelihood Method (MLM) [1]. The general logical scheme of the Bayesian method for estimating the distribution parameter values is presented in Fig. 1. We describe the main steps of its implementation in accordance with the scheme. A priori information about the parameter $\Theta$ is based on the history of the functioning of the process under study, as well as on theoretical provisions about its essence and specificity. This a priori information should be presented in the form of the density function $p(\Theta)$ of the a priori distribution law of the parameter $\Theta$.
Let additional statistics appear as the measurement results $X_1, \ldots, X_n$, distributed in accordance with the probability distribution law $f(X \mid \Theta)$. It is assumed that observations (2) are independent for fixed $\Theta$. The posterior distribution is calculated using formula (1). The construction of Bayesian point and interval estimates is based on knowledge of the posterior distribution law $p(\Theta \mid X_1, \ldots, X_n)$. To calculate the Bayesian confidence interval or Bayesian confidence area for a parameter, the following is required: A) in the case of a one-dimensional distribution law, calculate the posterior distribution according to formula (1).
Two problems are considered for a normally distributed observed random variable. In Problem 1 the mean is unknown and the variance is known. In Problem 2 both the mean and the variance are unknown. In [1] it was shown that the conjugate law for Problem 1 is the normal distribution law, whose mean value is arbitrary, and that the conjugate law for Problem 2 is the gamma-normal distribution.
Note that the general form of the posterior distribution law $p(\Theta \mid X_1, \ldots, X_n)$ in (1) is determined, up to the normalizing constant, only by the numerator of the right-hand side of this formula. Therefore, below, when analyzing equalities that hold up to the normalizing constant, we will use the sign "$\propto$".
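The role of the normalizing constant can be illustrated numerically. The following is a minimal Python sketch, not the authors' implementation: it evaluates only the numerator of formula (1) on a grid of values of a scalar parameter and normalizes afterwards. The measurement values, the known variance, the prior, the grid bounds, and all function names are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, mean, var):
    """Density of the normal law with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def grid_posterior(observations, prior_pdf, likelihood_pdf, grid):
    """Posterior density on a uniform grid.

    Only the numerator of the Bayes formula (likelihood times prior) is
    evaluated; the normalizing constant is recovered by a Riemann sum.
    """
    step = grid[1] - grid[0]
    unnormalized = []
    for theta in grid:
        likelihood = 1.0
        for x in observations:
            likelihood *= likelihood_pdf(x, theta)
        unnormalized.append(likelihood * prior_pdf(theta))
    norm = sum(unnormalized) * step
    return [u / norm for u in unnormalized]

# Hypothetical data: four readings of a device with known variance 0.25.
observations = [10.2, 9.8, 10.5, 10.1]
sigma2 = 0.25
grid = [8.0 + 0.001 * i for i in range(4001)]  # theta from 8 to 12
step = grid[1] - grid[0]

posterior = grid_posterior(
    observations,
    prior_pdf=lambda t: normal_pdf(t, 10.0, 1.0),        # assumed prior N(10, 1)
    likelihood_pdf=lambda x, t: normal_pdf(x, t, sigma2),
    grid=grid,
)

# Bayesian point estimate taken here as the posterior mean.
posterior_mean = sum(t * p for t, p in zip(grid, posterior)) * step
print(round(posterior_mean, 4))
```

For conjugate priors this numerical step is unnecessary, because the posterior law is available in closed form; the grid version is useful mainly as a cross-check.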
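According to [1], for Problem 1, with known variance $\sigma^2$ and sample mean $\bar{x}$:
$L(x_1, \ldots, x_n \mid \theta) \propto \exp\left\{ -\dfrac{n}{2\sigma^2} (\theta - \bar{x})^2 \right\}$.
The right-hand side of this relation is (up to a normalizing factor independent of $\theta$) the density of the normal distribution with mean $\bar{x}$ and variance $\sigma^2 / n$. Consequently, the conjugate a priori distribution law of the parameter $\theta$ itself belongs to the class of normal distribution laws [1].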
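The conjugacy for Problem 1 leads to a closed-form update. The sketch below uses the standard textbook result for a normal prior and a normal likelihood with known variance; it is assumed, not confirmed, to correspond to the recalculation formulas of the paper. The names d1 and d2 stand for the posterior mean and variance, and the data, prior values, and function names are hypothetical.

```python
import math
from statistics import NormalDist, mean

def update_normal_known_variance(prior_mean, prior_var, observations, sigma2):
    """Posterior mean d1 and variance d2 of theta for a normal prior.

    Precisions (inverse variances) add up, and d1 is the precision-weighted
    average of the prior mean and the sample mean.
    """
    n = len(observations)
    x_bar = mean(observations)
    prior_precision = 1.0 / prior_var
    data_precision = n / sigma2
    d2 = 1.0 / (prior_precision + data_precision)
    d1 = d2 * (prior_precision * prior_mean + data_precision * x_bar)
    return d1, d2

def central_interval(center, var, level=0.95):
    """Symmetric interval of a normal law with the given coverage level."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(var)
    return center - half_width, center + half_width

# Hypothetical readings, known variance, and prior N(10.0, 0.09).
observations = [10.2, 9.8, 10.5, 10.1]
sigma2 = 0.25
d1, d2 = update_normal_known_variance(10.0, 0.09, observations, sigma2)

bayes_interval = central_interval(d1, d2)
# Interval based on the sample mean alone (no prior information).
mlm_interval = central_interval(mean(observations), sigma2 / len(observations))

bayes_width = bayes_interval[1] - bayes_interval[0]
mlm_width = mlm_interval[1] - mlm_interval[0]
print(d1, d2, bayes_width < mlm_width)
```

How much the interval narrows depends on how informative the assumed prior is relative to the sample; the numbers here are illustrative only.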
Note that for Problem 2 the right-hand side is (up to a normalizing factor independent of $\theta$ and $h$) the density of the two-dimensional gamma-normal distribution law. Consequently, the set of conjugate a priori distribution laws of the two-dimensional parameter $(\theta, h)$ belongs to the class of two-dimensional gamma-normal distribution laws [1].
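For Problem 2 the conjugate update can be sketched in the same way. The parametrization below is an assumption, since the text does not fix one: $h$ is treated as the precision (inverse variance) with a gamma law of shape alpha and rate beta, and $\theta$ given $h$ is normal with mean mu and precision tau * h. Under that assumption the standard update is as follows; all names and numbers are hypothetical.

```python
def update_gamma_normal(mu, tau, alpha, beta, observations):
    """Posterior parameters of a gamma-normal law (assumed parametrization).

    theta | h ~ N(mu, 1 / (tau * h)),  h ~ Gamma(shape=alpha, rate=beta).
    """
    n = len(observations)
    x_bar = sum(observations) / n
    # Sum of squared deviations from the sample mean.
    ss = sum((x - x_bar) ** 2 for x in observations)

    mu_post = (tau * mu + n * x_bar) / (tau + n)
    tau_post = tau + n
    alpha_post = alpha + n / 2.0
    beta_post = beta + 0.5 * ss + tau * n * (x_bar - mu) ** 2 / (2.0 * (tau + n))
    return mu_post, tau_post, alpha_post, beta_post

# Hypothetical prior parameters and a small sample.
mu_post, tau_post, alpha_post, beta_post = update_gamma_normal(
    mu=0.0, tau=1.0, alpha=2.0, beta=2.0, observations=[1.0, 2.0, 3.0]
)

# Point estimates taken as posterior means of theta and h.
theta_estimate = mu_post
h_estimate = alpha_post / beta_post
print(mu_post, tau_post, alpha_post, beta_post)
```

The output of one update has the same form as its input, so the function can be applied again when further observations arrive.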

Method for calculating specific parameter values in conjugate a priori distribution laws
Using as a priori laws the probability distributions conjugate to the law of the observed general population allows us to determine only their general form, i.e., it defines a whole family of a priori laws. The parameters of the a priori distribution law can then be determined by the method of moments [1].
Since the calculation of the parameters for Problem 1 is obvious, we describe the procedure for calculating the parameters only for Problem 2. From the properties of the two-dimensional gamma-normal distribution law it follows that the marginal a priori distribution of the parameter $h$ is a gamma distribution law with parameters $\alpha$ and $\beta$. Therefore, using the given values of the a priori mean and variance of $h$, the parameters $\alpha$ and $\beta$ can be found by the method of moments.
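The method of moments for the gamma law of $h$ can be sketched as follows. This is an illustration under an assumed parametrization (shape alpha, rate beta, so that the mean is alpha / beta and the variance is alpha / beta^2); the function name and the numerical values are hypothetical.

```python
def gamma_params_from_moments(mean_h, var_h):
    """Match a gamma law (shape alpha, rate beta) to a given mean and variance.

    Solves mean = alpha / beta and variance = alpha / beta**2.
    """
    if mean_h <= 0.0 or var_h <= 0.0:
        raise ValueError("mean and variance of h must be positive")
    alpha = mean_h ** 2 / var_h
    beta = mean_h / var_h
    return alpha, beta

# Hypothetical a priori mean and variance of the precision h.
alpha, beta = gamma_params_from_moments(mean_h=4.0, var_h=2.0)
print(alpha, beta)
```

If the scale parametrization is preferred instead, the scale equals 1 / beta.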

Method for recalculation of parameter values
For Problem 1, note that the mean $d_1$ and variance $d_2$ of the posterior normal distribution law are weighted combinations of the a priori and sample means and variances, respectively. For Problem 2, when implementing the general scheme for converting a priori parameters into a posteriori ones, one should take into account the representation of the likelihood function $L$ in the form (6), as well as the form (6) of the a priori density of the two-dimensional gamma-normal distribution. The corresponding vector of parameters is given in Table 1.
Table 2 shows the point estimates and confidence intervals based on the Bayesian approach and the MLM. It can be seen that applying the Bayesian approach allows one to construct more accurate and reliable estimates. Fig. 2 shows a general view of the gamma-normal distribution law; the probability that $(\theta, h)$ falls within the confidence areas is 0.95 and 0.975, respectively. Note that with increasing $n$ the confidence areas become more and more similar to ellipses, since the gamma-normal distribution tends to a two-dimensional normal law. We also note that methods for constructing confidence regions in the shape of ellipses, rectangles, and ellipsoids are described and implemented in the scientific works of other authors.

Discussion
Modern innovative projects create the need to develop new measuring instruments and devices with specified technical, metrological, and operational characteristics. These characteristics are detailed in the technical specifications for the development of the new product. Before such instruments and devices are introduced, or before their state acceptance, a complete test cycle is carried out. The goal of the tests is to confirm the characteristics defined in the technical specifications. To achieve this goal, the test results are carefully analyzed, including by statistical methods.
The algorithms and results obtained in the article are aimed at the methodological support of the problems described above. The algorithms take into account the available statistical data together with a priori information about the process or object under study.
It is important that the distribution law obtained in the paper has an analytical form. In most practical problems a distribution function can be constructed only by numerical methods.
The method developed in this paper can be used to create self-learning and self-tuning systems. For this, formulas (7)-(9) must be applied successively as new data arrive.
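The successive application mentioned above can be sketched for Problem 1. The update below is the standard conjugate result for a normal prior with known observation variance, used as an assumed stand-in for the recalculation formulas; the batches of readings, the initial prior, and the function names are hypothetical. Each posterior serves as the prior for the next batch, and the final state coincides with a single update on all data.

```python
def update_normal_known_variance(prior_mean, prior_var, observations, sigma2):
    """One recalculation step: posterior mean and variance of theta."""
    n = len(observations)
    x_bar = sum(observations) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / sigma2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * x_bar)
    return post_mean, post_var

sigma2 = 0.25                      # assumed known variance of a reading
initial_prior = (10.0, 1.0)        # assumed a priori mean and variance
batches = [[10.2, 9.8], [10.5, 10.1], [9.9, 10.3, 10.0]]

# Self-tuning loop: the posterior of one step is the prior of the next.
state = initial_prior
history = []
for batch in batches:
    state = update_normal_known_variance(state[0], state[1], batch, sigma2)
    history.append(state)

# Single update on the pooled data, for comparison.
pooled = [x for batch in batches for x in batch]
batch_all = update_normal_known_variance(
    initial_prior[0], initial_prior[1], pooled, sigma2
)
print(state, batch_all)
```

The posterior variance decreases at every step, which is the sense in which the estimate is refined as data accumulate.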

Conclusions
The use of a priori information about the unknown value of the parameter, together with the Bayesian approach, made it possible to refine the estimates and, in particular, to narrow the interval estimate by a factor of 1.5 to 2 in comparison with the MLM. The developed Bayesian approach can yield significant gains in accuracy for limited sample sizes compared to the traditional approach (MLM). As the volume of additional data grows, or as a large number of different samples (statistical information) arrive, both approaches, being consistent, give increasingly similar results.