Mathematical model of bank scoring under conditions of insufficient data

Recently, various methods of object classification based on training datasets have become highly relevant. One of these methods is the naive Bayesian classifier. A class of objects may consist of a small number of elements; such a class is called a poor class. In this paper we consider the classification problem for a poor class. A logical classifier does not work in this case. A metric classifier can give good results if and only if there is a sufficiently dense set of metrically close classified objects in the neighborhood of the object under consideration. A Bayesian classifier re-evaluates all hypotheses about the object's membership in each class; therefore, a Bayesian classifier can solve this classification problem. As an example, we consider the classic problem of bank scoring based on two criteria, where the classified object has two membership hypotheses. The same reasoning can be applied to more complex cases.

Bank scoring is a procedure for evaluating a borrower's credit rating; on the basis of this evaluation, the bank decides whether to issue a loan to the borrower. In the course of its operation, the bank accumulates information about loans previously issued to various borrowers, thus building an extensive data frame of values characterizing each borrower. The target attribute here can take two possible values: 1) the borrower returned the loan, 2) the borrower did not return the loan.
This data frame can be represented as Table 1.
Based on the training data sample (the first $m$ rows of the data frame), the probability that the $(m + 1)$-th borrower will repay the loan is calculated. If this probability is too low, the loan is refused; if it is high enough, the loan is issued. The loyalty threshold is set by the bank depending on its credit policy.
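The decision rule described above can be sketched as follows; the threshold value and the function name are illustrative, not taken from the paper:

```python
# A bank's loyalty threshold, set by its credit policy (illustrative value).
LOYALTY_THRESHOLD = 0.7

def decide(p_repay: float, threshold: float = LOYALTY_THRESHOLD) -> str:
    """Issue the loan iff the estimated repayment probability
    reaches the bank's loyalty threshold."""
    return "issue" if p_repay >= threshold else "refuse"

print(decide(0.9))  # issue
print(decide(0.4))  # refuse
```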
In terms of data analysis, bank scoring is a typical classification task. Classification procedures are well studied and described by both Russian (see [1-5]) and foreign researchers (see [6, 7]). The authors of this paper consider two versions of the probability classifier.

Classical Bayesian classifier
Bayes' formulas are well known from introductory probability theory. They describe the relationship between the prior and posterior probabilities of a set of hypotheses. More specifically, if $H_1, \dots, H_n$ is a set of hypotheses, then after the condition $A$ is observed, the new probabilities of the hypotheses are calculated by the formulas

$$P(H_i \mid A) = \frac{P(A \mid H_i)\,P(H_i)}{\sum_{j=1}^{n} P(A \mid H_j)\,P(H_j)}. \qquad (1)$$

If two conditions $A$ and $B$ are observed, the posterior probabilities of the hypotheses are evaluated similarly:

$$P(H_i \mid A, B) = \frac{P(A, B \mid H_i)\,P(H_i)}{\sum_{j=1}^{n} P(A, B \mid H_j)\,P(H_j)}. \qquad (2)$$

Clearly, there can be many conditions of this kind. In data analysis tasks, meeting a set of conditions means that the features represented in the data frame take on specific values. In this case, the class of objects that meets all these conditions may be very poor or even empty. Therefore, the classical Bayesian classifier is rarely used in classification problems.
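Formula (1) can be illustrated with a short computation; the function name and the toy prior/likelihood values below are ours, not the paper's:

```python
def bayes_update(priors, likelihoods):
    """Classical Bayes update, formula (1).
    priors[i] = P(H_i); likelihoods[i] = P(A | H_i).
    Returns the posterior P(H_i | A) for each hypothesis."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # P(A), the normalizing denominator
    return [j / total for j in joint]

# Two equally likely hypotheses; A is four times as likely under H_1.
post = bayes_update([0.5, 0.5], [0.8, 0.2])
print(post)  # approximately [0.8, 0.2]
```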

Naive Bayesian classifier
Let us consider the naive Bayesian classifier algorithm for the case of two conditions. First, the classical Bayes formula is applied:

$$P(H_i \mid A, B) = \frac{P(A, B \mid H_i)\,P(H_i)}{P(A, B)}.$$

The right-hand side is a fraction, so the left-hand side is proportional to its numerator. Using the symbol $\propto$ to denote proportionality, we get

$$P(H_i \mid A, B) \propto P(A, B \mid H_i)\,P(H_i).$$

Now let us apply the so-called "naive assumption": we assume that the events $A$ and $B$ are independent. Then

$$P(H_i \mid A, B) \propto P(A \mid H_i)\,P(B \mid H_i)\,P(H_i).$$

Performing this operation for all hypotheses, we obtain a family of proportionalities with a common factor. The latter relation therefore means an exact equality with some unknown coefficient $\alpha$:

$$P(H_i \mid A, B) = \alpha\,P(A \mid H_i)\,P(B \mid H_i)\,P(H_i).$$

The coefficient is determined from the normalization condition

$$\sum_{i=1}^{n} P(H_i \mid A, B) = 1,$$

after which the posterior probabilities are calculated. The classical Bayesian classifier makes it possible to re-evaluate a single hypothesis without appealing to the posterior probabilities of the other hypotheses, while the naive Bayesian classifier can only re-evaluate all hypotheses at once.
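The derivation above (multiply the prior by the two likelihoods, then normalize with the coefficient $\alpha$) can be sketched as follows; the toy numbers are illustrative:

```python
def naive_bayes(priors, lik_a, lik_b):
    """Naive Bayes over hypotheses H_i with two conditions A and B:
    P(H_i | A, B) = alpha * P(A|H_i) * P(B|H_i) * P(H_i),
    where alpha makes the posteriors sum to 1."""
    scores = [p * a * b for p, a, b in zip(priors, lik_a, lik_b)]
    alpha = 1.0 / sum(scores)  # normalization coefficient
    return [alpha * s for s in scores]

# Unnormalized scores: 0.6*0.5*0.2 = 0.06 and 0.4*0.25*0.1 = 0.01,
# so the posteriors are 6/7 and 1/7.
post = naive_bayes([0.6, 0.4], [0.5, 0.25], [0.2, 0.1])
print(post)
```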
In practice, the "naive assumption" usually does not hold: the value of one attribute taken by an object somehow affects the values of its other attributes. Nevertheless, when solving practical problems, the naive Bayesian classifier often shows better results than the classical one.

Examples
Example 1. Let us consider a data frame that contains information on whether the borrower has returned the loan (the Score attribute), as well as the borrower's gender (the Sex attribute) and whether he has a criminal record (the Crime attribute).
The basic information is presented in Table 2; Tables 3 and 4 differ from Table 2 only by permutations of rows, with the data grouped there for ease of perception. Let us calculate the probability of loan repayment by a borrower belonging to the class of "convicted males".
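Since Tables 2-4 are not reproduced here, the following sketch shows the direct frequency estimate of P(Score = "+" | Sex = "Male", Crime = "Yes") on hypothetical records; the data below are ours, not the paper's:

```python
# Hypothetical borrower records: (Sex, Crime, Score).
records = [
    ("Male", "Yes", "+"), ("Male", "Yes", "-"), ("Male", "Yes", "-"),
    ("Male", "No", "+"), ("Female", "No", "+"), ("Female", "Yes", "-"),
]

# Restrict to the class of "convicted males" and count repayments.
cls = [r for r in records if r[0] == "Male" and r[1] == "Yes"]
p = sum(1 for r in cls if r[2] == "+") / len(cls)
print(p)  # 1/3 on this toy data
```

This direct estimate is exactly what becomes unreliable when the class is poor, as Example 2 shows.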
Example 2. Let us again look at the data on borrowers, but this time let the class of "convicted males" be very poor, containing only one representative (see Tables 5, 6 and 7). Let us calculate the probability of loan repayment by a borrower belonging to this class.
From Table 7 we get $P(\text{Score} = \text{"+"}) = 10/16$, and from Table 6, $P(\text{Sex} = \text{"Male"}, \text{Crime} = \text{"Yes"} \mid \text{Score} = \text{"+"}) = 1/10$. Substituting these values into formula (2) and noting that the sample contains no convicted male who failed to repay, we obtain $P(\text{Score} = \text{"+"} \mid \text{Sex} = \text{"Male"}, \text{Crime} = \text{"Yes"}) = 1$. The same result can be obtained directly from Table 6. However, the result obtained contradicts common sense, since it is well known that a man is a less responsible social counterparty than a woman, and a convicted person is a less responsible social counterparty than a non-convicted one. At the same time, the formal calculations given above show that if the borrower is a man with a criminal record, then he will certainly return the loan. This is absurd.
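The calculation of Example 2 can be reproduced programmatically; the counts below come from the fractions quoted above (16 borrowers, 10 of whom repaid, with the single convicted male among the repayers):

```python
# Classical Bayes posterior for the poor class of Example 2.
p_plus = 10 / 16          # P(Score = "+")
p_minus = 6 / 16          # P(Score = "-")
p_cond_plus = 1 / 10      # P(Male, Crime | Score = "+")
p_cond_minus = 0 / 6      # P(Male, Crime | Score = "-"): no such defaulter

num = p_cond_plus * p_plus
posterior = num / (num + p_cond_minus * p_minus)
print(posterior)  # 1.0 -- certainty inferred from a single observation
```

A single observation in the poor class drives the posterior to certainty, which is exactly the absurdity the example illustrates.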