Order logistics based on collaborative filtering

Modern approaches to organizing and managing logistics operators involve forecasting the dynamics of the processes of the company under study and developing recommendations for its further operation in the market of goods and services. Services integrated into the order and delivery logistics system are an integral part of it and accompany the flow processes throughout the supply chain. Today, information technologies, communications and technological solutions make it possible to implement even the most ambitious offers of logistics companies. Innovative solutions built on information technologies advance the “design - practical implementation” cycle in logistics systems and form a digital space for managing material flows. Forecasting the indicators that characterize an order requires taking into account the already known preferences of customers. This is the task of recommendation systems, which propose a package of services for assembling and delivering a batch of orders based on the established preferences of other customers and users of the services. The forecasts obtained for the system under study should make it possible to develop an individual package of solutions from those known preferences. The key value is the information that the model uses to build forecasts.
In practice, finding solutions in the management of order procedures in logistics comes down to analyzing the key problems and “problem points” that affect the overall performance indicator, taking into account the geography of transport systems, established trade practice, the characteristics of the markets studied and the location of each specific company.


Introduction
The functional area of supply logistics includes the organization of procurement at the enterprise and determines the direction of the material flow, solving the problems of organizing purchases, supplies and the physical distribution of goods. The main goal of supply logistics is the rational organization of the supply of material resources in accordance with the known rules of logistics. The terminology of procurement logistics is connected with the market analysis of the necessary goods and services, the selection of a supplier, the procedure for agreeing on the main terms of delivery and, ultimately, the organization of the transportation of material assets.
To strengthen competitive positions, it is necessary to search for modern digital solutions, form digital platforms and justify a package of services in the market of goods and services. For logistics operators today, this task boils down to developing high-quality and accurate forecasting methods for order management in transport and logistics systems. It requires a theoretical basis and practical recommendations for determining promising directions in composing the list of ordered goods and, in part, the package of services for their delivery, so as to minimize the total costs of the transport and logistics system. Of great importance are the digital component and intelligence, which have set the vector for the targeted development of promising solutions in almost all areas of science and technology, including the management of transport and logistics systems. The development of information and communication network technologies and of online orders raises the problem of recommending goods and services and of forming a set of services for the users of the information system [1]. Solving this problem determines the relevance of research into the practical use of existing solutions in recommendation systems that form the optimal assembly of a batch of orders based on the preferences of other customers and users of the services. These issues become particularly relevant as information and communication systems develop, when the known preferences of customers accessing logistics services form the basis for recommendations that are close to them. The predictive estimates obtained for the system under study should make it possible to develop an individual package of solutions based on the already known preferences of other users.
This is the key idea of using collaborative filtering as a scientific and practical tool for developing recommendations [2].
Collaborative filtering based on neighborhood.
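Step 1: for each user u, a measure of its proximity to user a is calculated as the Pearson correlation coefficient $w_{a,u}$ (1):
$$w_{a,u} = \frac{\sum_{i \in I} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i \in I} (r_{a,i} - \bar{r}_a)^2}\,\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2}} \quad (1)$$
where $w_{a,u}$ is a measure of the closeness of users a and u; I is the set of objects rated by both user a and user u; $r_{u,i}$, $r_{a,i}$ are the evaluations of object i by users u and a, respectively; $\bar{r}_a$, $\bar{r}_u$ are the average scores of users a and u, respectively.
Step 2: the proximity of each user to user a can also be measured by the cosine between the rating vectors, formula (2):
$$w_{a,u} = \cos(\vec{r}_a, \vec{r}_u) = \frac{\vec{r}_a \cdot \vec{r}_u}{\|\vec{r}_a\|_2 \, \|\vec{r}_u\|_2} \quad (2)$$
The users closest to user a are then selected as its "neighbors".
Step 3 involves calculating the rating of the object based on the ratings of the selected "neighbors", formula (3):
$$p_{a,i} = \bar{r}_a + \frac{\sum_{u \in K} (r_{u,i} - \bar{r}_u)\, w_{a,u}}{\sum_{u \in K} |w_{a,u}|} \quad (3)$$
where $p_{a,i}$ is the predicted evaluation; K is the set of selected "neighbors"; $w_{a,u}$ is the measure of proximity between users a and u (step 1); $\bar{r}_a$, $\bar{r}_u$ are the average scores of users a and u, respectively.
Next, we will consider the collaborative filtering procedure based on the proximity of objects. This procedure includes the following sequence of steps.
Step 1: for each object j, a measure of its proximity to object i is calculated (4):
$$w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2}\,\sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}} \quad (4)$$
where U is the set of users who rated both i and j; $r_{u,i}$ is the evaluation of object i by user u; $\bar{r}_i$ is the average rating of object i.
Step 2: the set of objects closest to object i is selected.
Step 3: the rating of the object is predicted from the ratings of the objects close to it (5):
$$p_{a,i} = \frac{\sum_{j \in K} r_{a,j}\, w_{i,j}}{\sum_{j \in K} |w_{i,j}|} \quad (5)$$
where K is the set of objects that are "close" to i; $w_{i,j}$ is a measure of proximity between i and j.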
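The user-based procedure of formulas (1) and (3) can be illustrated with a minimal Python sketch. It assumes that ratings are stored as a nested dictionary and that each user's average score is taken over all objects that user has rated; the function names, the parameter k and the toy data are hypothetical and serve only as an illustration.

```python
from math import sqrt


def mean_rating(ratings, user):
    """Average score of a user over all objects the user has rated (assumption)."""
    values = ratings[user].values()
    return sum(values) / len(values)


def pearson(ratings, a, u):
    """Formula (1): Pearson proximity of users a and u over their co-rated objects."""
    common = set(ratings[a]) & set(ratings[u])
    if not common:
        return 0.0
    mean_a, mean_u = mean_rating(ratings, a), mean_rating(ratings, u)
    num = sum((ratings[a][i] - mean_a) * (ratings[u][i] - mean_u) for i in common)
    den_a = sqrt(sum((ratings[a][i] - mean_a) ** 2 for i in common))
    den_u = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    if den_a == 0 or den_u == 0:
        return 0.0  # assumption: an undefined correlation is treated as no proximity
    return num / (den_a * den_u)


def predict(ratings, a, item, k=2):
    """Formula (3): predicted score of `item` for user a from the k closest users."""
    candidates = [u for u in ratings if u != a and item in ratings[u]]
    weights = {u: pearson(ratings, a, u) for u in candidates}
    # Step 2: keep the k users with the highest proximity to user a.
    neighbors = sorted(candidates, key=lambda u: weights[u], reverse=True)[:k]
    norm = sum(abs(weights[u]) for u in neighbors)
    mean_a = mean_rating(ratings, a)
    if norm == 0:
        return mean_a  # no informative neighbors: fall back to the user's average
    deviation = sum(
        (ratings[u][item] - mean_rating(ratings, u)) * weights[u] for u in neighbors
    )
    return mean_a + deviation / norm


# Hypothetical toy data: user -> {object: score}.
toy_ratings = {
    "a": {"i1": 5, "i2": 3, "i3": 4},
    "u1": {"i1": 5, "i2": 3, "i3": 4, "i4": 4},
    "u2": {"i1": 1, "i2": 5, "i3": 2, "i4": 1},
}

if __name__ == "__main__":
    print(predict(toy_ratings, "a", "i4"))
```

In this toy example user u1 rates the shared objects exactly like user a, while u2 rates them in the opposite way, so both neighbors push the prediction for i4 above the average score of user a.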
Collaborative filtering based on the model: the Naive Bayesian Classifier algorithm (6). A fixed object is considered, with the following notation: classSet = {1, 2, 3, 4, 5} is the set of classes (scores); UserSet is the set of users who rated this object; $p(U_a = j)$ is the probability of class (score) j for user a; $P(U_z = r \mid U_a = j)$ is the probability that user z has given the rating r (taken from the rating matrix), provided that user a has given the rating j. The predicted class is
$$class = \arg\max_{j \in classSet} \; p(U_a = j) \prod_{z \in UserSet} P(U_z = r_z \mid U_a = j) \quad (6)$$
The Laplace estimate is calculated by formula (7):
$$P(U_z = r \mid U_a = j) = \frac{\#(U_z = r,\, U_a = j) + 1}{\#(U_a = j) + |classSet|} \quad (7)$$
where # is the number of objects that have the corresponding user ratings. The models presented above form the mathematical basis of collaborative filtering; other approaches are analyzed further below.
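The Naive Bayesian scheme with the Laplace estimate can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the rating matrix is stored as a dictionary "object -> {user: score}", applies the same add-one smoothing to the prior $p(U_a = j)$, and works with logarithms to avoid multiplying many small probabilities. The function name and the toy data are hypothetical.

```python
from math import log

CLASS_SET = (1, 2, 3, 4, 5)


def predict_class(matrix, a, target):
    """Most probable score of user `a` for object `target` (formulas (6)-(7))."""
    # Training objects: every other object that user a has already rated.
    train = [row for obj, row in matrix.items() if obj != target and a in row]
    # Evidence: scores given to the target object by the other users (UserSet).
    evidence = {z: r for z, r in matrix[target].items() if z != a}

    best_class, best_score = None, float("-inf")
    for j in CLASS_SET:
        rows_j = [row for row in train if row[a] == j]
        # Prior p(U_a = j), smoothed the same way as formula (7) (assumption).
        score = log((len(rows_j) + 1) / (len(train) + len(CLASS_SET)))
        for z, r in evidence.items():
            both = [row for row in rows_j if z in row]
            matches = sum(1 for row in both if row[z] == r)
            # Laplace estimate of P(U_z = r | U_a = j), formula (7).
            score += log((matches + 1) / (len(both) + len(CLASS_SET)))
        if score > best_score:
            best_class, best_score = j, score
    return best_class


# Hypothetical toy rating matrix: object -> {user: score}.
toy_matrix = {
    "o1": {"a": 5, "u1": 5},
    "o2": {"a": 5, "u1": 5},
    "o3": {"a": 1, "u1": 1},
    "o4": {"a": 1, "u1": 1},
    "o5": {"u1": 5},
}

if __name__ == "__main__":
    print(predict_class(toy_matrix, "a", "o5"))
```

In the toy matrix user a has always agreed with user u1, so a score of 5 from u1 for the new object makes class 5 the most probable one for user a.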

Materials and Methods
When solving the problem of calculating the economic size of the order (the optimal order size), the costs of acquisition, storage, delivery and losses from the shortage of products are taken into account. According to a number of researchers, this formulation does not contain "hidden" costs and a value that takes into account the interdependence and mutual influence of current and insurance reserves [3].
Most often, the total costs are determined by the formula
$$\sum C = C_K + C_R + C_S + C_{SH} + C_P$$
where $C_K$ is the costs related to purchases, rub.; $C_R$ is the costs associated with the reserve, rub.; $C_S$ is the storage costs, rub.; $C_{SH}$ is the costs related to the absence of goods (shortage), rub.; $C_P$ is the price per unit, rub.
The costs for the cycle $C_C$ are determined by the formula
$$C_C = C_O + S\,h\,T$$
where $C_O$ is the costs for the order, rub.; S is the average reserve quantity (current reserve), units; h is the storage costs, rub./(unit·day); T is the cycle time, days.
The cycle time is related to the desired order value S, units, through the demand intensity (10). Based on the expressions set out above, the storage costs and then the costs per cycle are written out as functions of the order value.
Next, we proceed to the calculation of the optimal order size. The demand A for the ordered product during the time period is determined from the number of working days $D_P$, units, and the storage costs for the period follow from it. The optimal order value, the number of orders, the cycle time and the minimum total costs are then determined in turn.
If the cost of the order is the sum of the transaction costs of finding the customer (everything needed to sign the contract) and the transport costs, then the cost of storage can be determined as a share of the value of the product, where f is the share related to storage costs, %. While the product is in storage, its value has one level, but in most cases (movement, recycling, processing) the value is different.
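The sequence "optimal order value, number of orders, cycle time, minimum total costs" can be illustrated with the classical Wilson economic order quantity model. The sketch below assumes this standard model, in which only the order costs and the storage costs depend on the order size and shortages are not allowed; the formulas actually used in the study may contain additional terms. The function name, the argument names and the numbers are hypothetical.

```python
from math import sqrt


def economic_order_quantity(order_cost, demand_per_day, storage_cost, working_days):
    """Classical Wilson model (an assumption, not necessarily the authors' formulas).

    order_cost      -- costs for one order, rub.
    demand_per_day  -- demand intensity, units/day
    storage_cost    -- storage costs, rub./(unit*day)
    working_days    -- number of working days in the period
    """
    # Optimal order size minimizes (order costs + storage costs) per day.
    optimal_size = sqrt(2 * order_cost * demand_per_day / storage_cost)
    period_demand = demand_per_day * working_days   # demand for the period
    number_of_orders = period_demand / optimal_size
    cycle_time = optimal_size / demand_per_day      # days between orders
    # Minimum order + storage costs for the whole period (purchase price excluded).
    min_costs = sqrt(2 * order_cost * demand_per_day * storage_cost) * working_days
    return {
        "optimal_size": optimal_size,
        "period_demand": period_demand,
        "number_of_orders": number_of_orders,
        "cycle_time": cycle_time,
        "min_costs": min_costs,
    }


if __name__ == "__main__":
    # Hypothetical inputs: 100 rub. per order, 8 units/day, 1 rub./(unit*day), 250 days.
    result = economic_order_quantity(100, 8, 1, 250)
    for name, value in result.items():
        print(name, value)
```

With these hypothetical inputs the optimal order is 40 units every 5 days, i.e. 50 orders per period; the order costs (50 × 100 rub.) and the storage costs (an average reserve of 20 units × 1 rub. × 250 days) are equal at the optimum, which is a characteristic property of this model.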
Taking into account the lease of storage facilities, the cost calculations must include the cost of storing a unit of production with regard to the occupied area, rub./m², and the coefficient k that takes into account the spatial dimensions of a unit of production, m²/unit. With this in mind, the value of the optimal order size is determined anew.
It is necessary to note the tasks of organizing multi-product deliveries and the specifics of their practical implementation. Given the existing methods for solving these problems, it is proposed to form a target offer based on the existing preferences of other groups of customers, taking into account the current market trends for these products.
The traditional view of these issues was related to the development of recommendations on the degree of popularity and established views, trade practice and other issues of authenticity in the management of trade and intermediary activities. The digital development of the studied areas contributes to the attraction of methodological approaches to the development of targeted offers that recommend to consumers and users of services not only what is in demand, but also those goods and services that are most likely to suit them.
Collaborative filtering is a method of forming a target offer to the user on the basis of evaluating the preferences of the studied sample of users. When the target offer is formed, the indicators of the current user (for whom recommendations are made) are compared pairwise with the indicators of the users of the studied sample, and the group of sample users closest to the current user is selected. The indicators of the users of this group form the basis for the recommendations.
Collaborative filtering differs from the simpler approach, which gives an average score for each order object, for example, based on the number of votes cast for it. In our opinion, it is necessary to conduct research and develop recommendations using sound methods and models.

Figure 1. Frequency of user preferences.
The scientific and practical basis for the procedures related to the preparation of target offers are the methods of data collection, processing and analysis. It should be noted that target offers can be formed for various fields of activity related to the provision of various types of offers. This approach is based on solving the problem of determining the degree of proximity (connectivity) of the objects under study. Figure 1 shows data on user preferences. To implement the procedures for forming target offers, various methods of data processing and analysis can be used, which can significantly improve the efficiency of decision-making. These methods and models include:
- artificial neural networks [1];
- expert systems based on the use of fuzzy inference algorithms [1];
- statistical models;
- models based on the application of game theory, etc. [2][3].
To implement the collaborative filtering algorithm, a formal problem statement is proposed.
Let there be a current user P, for whom sets of recommendations are formed. The vector of preferences $(p_1, p_2, \dots, p_n)$ is put in correspondence with P:
$$P \rightarrow \vec{p} = (p_1, p_2, \dots, p_n),$$
where the vector belongs to a universal set of preferences. Similar mappings exist for each user of a representative sample. All the vectors of indicators presented above map to sets of natural numbers: their values are the frequency indicators of user requests to certain resources, goods, services, etc.
For the practical implementation of collaborative filtering mechanisms, it is proposed to use a number of procedures related to the processing of information obtained as a result of the experiment. Among the many methods of data processing, 4 methods are identified in this study: the method of analysis based on the calculation of metrics, correlation analysis, a method based on the calculation of a nonparametric statistical criterion for evaluating the similarity of individual numerical data samples, and methods of hierarchical clustering of data. Table 1 provides a brief analysis of the methods identified by the authors, from the point of view of their use in the collaborative filtering procedure.
Since the distribution of the data is not known beforehand and its distribution may deviate from the normal one, it is best to use the nonparametric criterion of mathematical statistics to assess the degree of similarity in the data samples.
To evaluate the indicators of demand for goods, we implement a procedure based on the calculation and evaluation of the Mann-Whitney U-test (U-test). This criterion works with the ranks of observations and allows confirming or refuting the null hypothesis that the data samples do not differ. There are restrictions on the size of the samples and on the number of matching values in them.

Method: analysis of hierarchical clustering.
Advantages: multidimensional data analysis, the ability to obtain a sample of solutions and to determine the level of significance of the evaluation results.
Disadvantages: the complexity of the implementation and the problem of determining the optimal number of clusters and, as a result, their size.
Stages of the U-test calculation procedure:
1. The samples of the analyzed data are combined into one sample.
2. The data of the combined sample of indicators are ranked in ascending order of the attribute (speed of movement and mileage).
3. The sample obtained in point 2 is divided into the two initial samples.
4. The sum of the ranks is calculated for each data sample, and the larger of the two rank sums, $R_{max}$, is determined.
5. The value of the U-test is calculated using formula (34):
$$U = n_X \, n_Y + \frac{n_{max}(n_{max} + 1)}{2} - R_{max} \quad (34)$$
where $n_X$ is the amount of data in the 1st data sample; $n_Y$ is the amount of data in the 2nd data sample; $n_{max}$ is the amount of data in the sample with the larger rank sum.
6. The critical value of the statistical criterion, $U_{cr}$, is determined from the table of mathematical statistics.
7. The null hypothesis is confirmed or refuted. If the condition $U > U_{cr}$ is met, then the considered data samples are only slightly distinguishable. If the condition is not met, then the differences in the considered samples are significant.
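The stages above can be sketched in Python as follows. The sketch assumes that tied values share the mean of their ranks (a common convention that the text does not state explicitly) and that the decision rule is the one used in the Results section, where the samples are treated as similar when the calculated value exceeds the critical one. The function names and the sample data are hypothetical.

```python
def average_ranks(values):
    """Ranks in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def u_statistic(x, y):
    """U value computed by stages 1-5 of the procedure."""
    ranks = average_ranks(list(x) + list(y))       # stages 1-2: combine and rank
    rank_sum_x = sum(ranks[:len(x)])               # stages 3-4: split, sum the ranks
    rank_sum_y = sum(ranks[len(x):])
    if rank_sum_x >= rank_sum_y:
        n_max, r_max = len(x), rank_sum_x
    else:
        n_max, r_max = len(y), rank_sum_y
    return len(x) * len(y) + n_max * (n_max + 1) / 2 - r_max   # stage 5


def samples_similar(x, y, u_critical):
    """Stage 7: the samples are treated as similar when U exceeds the critical value."""
    return u_statistic(x, y) > u_critical


if __name__ == "__main__":
    # Hypothetical request frequencies of two users; the critical value is illustrative.
    current_user = [1, 4, 5]
    other_user = [2, 3, 6]
    print(u_statistic(current_user, other_user))
    print(samples_similar(current_user, other_user, u_critical=1))
```

The critical value itself is not computed here; as in stage 6, it is assumed to be taken from a statistical table for the given sample sizes and significance level.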

Results
As an example, we will consider the frequency distribution of requests for certain goods (Table 2).
When ranking these samples, we assign ranks to objects by some attribute, so that metric values are converted to rank values. In most cases, it is customary to assign the first rank (a rank equal to one) to the object with the highest (or best) indicator. However, it is also possible to rank the data in reverse order, which does not affect the result of the assessment. Next, we need to form a target offer using a statistical criterion. The iterations of this process are shown below. We implement the algorithm for calculating the criterion by combining the samples, sorting them and assigning the ranks for the pair of vectors "Current user - User 1" (Table 3).

Table 3. Implementation of the algorithm for the pair "Current user - User 1" (columns: No.; Combining; Sorting and ranking).

We separate the samples with ranks and calculate the total rank (Table 4). The critical value is $U_{cr} = 1$.
For the pair "Current user - User 1", $U > U_{cr}$, so the null hypothesis about the similarity of the samples is accepted, i.e., to form a target offer, we can focus on User 1.
For the pair "Current user - User 2", $U > U_{cr}$, so the null hypothesis about the similarity of the samples is accepted, i.e., to form a target offer, we can focus on User 2.
For the pair "Current user - User 3", $U < U_{cr}$, so the null hypothesis about the similarity of the samples is rejected, i.e., to form a target offer, we cannot focus on User 3.
The solution of the problem is associated with the evaluation of the numerical vectors of the current user and the sample of user vectors that form the evaluation base. The calculation of the statistical criterion shows that for the formation of the target offer in the order formation, we can consider the preferences of User 1 and User 2, i.e. recommend to the current user the trajectory of choice that Users 1 and 2 adhere to. We get a solution that allows forming a knowledge base of the target offer of the company in the studied market of goods.

Discussions
After the calculations on the conditional example of forming orders based on preferences, we will consider the ordering of spare parts for cars. When spare parts suppliers are contacted, the assembly of the batch reflects real needs arising from the maintenance and repair of rolling stock and depends on the mileage, operating conditions and design features of the vehicle. Therefore, knowing the real picture and evaluating it objectively, "from the inside", it becomes possible to offer the current user the sets that will allow them to use funds rationally and to anticipate possible extraordinary problems. With primary and one-time purchases, there is a lower risk of error when selecting the entire batch of goods; when a repeated, modified purchase is organized, it is necessary to take into account the emerging requirements and existing trends in the system under consideration.
In turn, the supplier has information that shows whether it is feasible to combine the ordered goods into groups for control during the ABC-XYZ analysis, taking into account the possibility of joint storage. It should be noted that the dynamic ABC analysis takes into account the influence of the time factor on the structure and size of the sample for analysis (the observation period). The analyzed nomenclature items can migrate from one group to another. This is due to various circumstances: seasonality, product life cycle, economic situation on the market, fashion, etc. In other words, ABC monitoring is necessary, which makes it possible to identify a stable migration trend and to make the necessary management decisions.
We will consider the algorithm of the ABC analysis for a nomenclature of n items by the indicator R (average growth rate). The basic algorithm consists of calculating the average growth rate of the criterion R, ranking the items, building the cumulative series (%) and selecting group A and then group B by comparing the criterion with the average growth rate. In this way, we can divide the nomenclature series into a different, specified number of groups (A-B-C-D-E...). In this case, each group determines its own average growth rate of the criterion. According to Zaitsev E. I., the generalized model differs from the considered one by the possibility to set any growth rate of the criterion for each group.
Let $\tau_0 = 100/n$ be the average growth rate of the criterion, % (n is the number of nomenclature items). Then the set rate can be determined by formula (41):
$$\tau = k\,\tau_0 \quad (41)$$
where k > 1 is the above-average rate and 0 < k < 1 is the below-average rate.
If the growth rate is set to be 10% higher than the average, we get $\tau = 1.1\,\tau_0$. With this in mind, the ABC grouping algorithm looks like this.
1. Let $R_j$ be the variation series of the criterion in %, $j = 1, \dots, n$; $k_A$, $k_B$ are the coefficients of change in the average growth rate of the criterion for groups A and B, respectively.
2. The growth rate of the criterion for group A is $\tau_A = k_A \tau_0$.
3. Group A. The group includes all items for which $R_j \ge \tau_A$. The number of nomenclature items in the group, $n_A$, is determined by the variation series $R_j$ sorted in descending order, and their share $P_A$ in % is determined by the cumulative series.
4. Group B. The average rate for the remainder of the nomenclature after the exclusion of group A is calculated:
$$\tau_0^B = \frac{1}{n - n_A} \sum_{j = n_A + 1}^{n} R_j$$
The growth rate of the criterion for the group is $\tau_B = k_B \tau_0^B$. The group includes all items of the remainder for which $R_j \ge \tau_B$. The number of nomenclature items in the group, $n_B$, is determined by the variation series $R_j$ sorted in descending order, and their share $P_B$ in % is determined by the cumulative series.
5. The remaining nomenclature belongs to group C.
6. If necessary, we can continue the calculations by analogy, sequentially selecting groups C, D, E, etc.
In the particular case when the growth rate correction coefficients are equal to unity ($k_A = k_B = k_C = 1$), we have an algorithm for grouping by the average growth rate of the criterion in each group.
This approach, expressed as an algorithm, is easily automated and can be used in multidimensional ABC analysis with OLAP-class programs for sorting and grouping objects, which is becoming particularly important today [4].
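As an illustration of how the grouping algorithm can be automated, a minimal Python sketch is given below. It assumes that the criterion is supplied as absolute values per nomenclature item, that these values are converted to percentage shares, and that an item enters a group when its share is not less than the growth rate set for that group; the function name, the argument names and the data are hypothetical.

```python
def abc_grouping(criterion, coefficients=(1.0, 1.0)):
    """Group items by the average growth rate of the criterion.

    criterion    -- dict: item name -> value of the criterion (any units)
    coefficients -- correction coefficients (k_A, k_B, ...) for the selected groups;
                    the remainder after the last coefficient forms the final group
    """
    total = sum(criterion.values())
    # Variation series R_j in %, sorted in descending order.
    remainder = sorted(
        ((100.0 * value / total, name) for name, value in criterion.items()),
        key=lambda pair: pair[0],
        reverse=True,
    )
    letters = "ABCDEFGHIJ"
    groups = {}
    for letter, k in zip(letters, coefficients):
        if not remainder:
            groups[letter] = []
            continue
        # Average rate of the items still ungrouped (100 / n for group A).
        tau_0 = sum(share for share, _ in remainder) / len(remainder)
        tau = k * tau_0  # set rate for this group, formula (41)
        selected = [name for share, name in remainder if share >= tau]
        groups[letter] = selected
        # The series is sorted, so the selected items form its leading part.
        remainder = remainder[len(selected):]
    groups[letters[len(coefficients)]] = [name for _, name in remainder]
    return groups


if __name__ == "__main__":
    # Hypothetical criterion values (for example, turnover) for seven items.
    turnover = {"p1": 50, "p2": 20, "p3": 10, "p4": 8, "p5": 5, "p6": 4, "p7": 3}
    print(abc_grouping(turnover, coefficients=(1.1, 1.0)))
```

Passing more coefficients, for example `(1.1, 1.0, 1.0)`, continues the same procedure and yields groups A, B, C and D.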

Conclusion
Modern communications essentially determine the prospects for the development of logistics systems of orders and are implemented in the organization of work with suppliers and consumers of goods and services. Functionally, there is interaction between logistics, production systems design, market research, accounting, and finance. Accurate and timely communication, based on a high-quality forecast of needs, is the cornerstone of the implementation of logistics activities.
The order sent by the customer acts as the engine of the entire logistics process. For the successful functioning of the entire supply chain, coordinated management is necessary, taking into account the incoming information about the order parameters, the schedule of works and the shipment of goods. The speed and accuracy of order fulfillment in the logistics system have a significant impact on the level of customer service. The order processing cycle is a key area of interaction with the consumer, and the consumer's assessment of the entire company's performance directly depends on it. Optimization of the chain "order - raw materials - production - warehousing - sales" currently determines the need for greater integration of logistics processes. Even the most advanced technologies at the manufacturer do not guarantee a reduction in the cost of production across the entire supply chain. Logistics technologies implemented in the practice of modern enterprises effectively remove the main contradiction between production and consumption, reducing the overall cost. Thus, the listed range of questions and the proposed answers are associated with the use of integrated logistics systems that cover the entire complex of intra-production logistics functions, together with the financial and information flows of the system under study.