Methods of Machine Learning in the Master’s Educational Program

This article considers machine learning methods that are being introduced into the master’s educational program in the direction “Organization and management of knowledge-intensive industries”. This direction should focus primarily on the digitalization of education. The digital economy, which is rapidly becoming part of modern management and production methods, is changing approaches to education and to universities, whose graduates must meet the requirements of the digital job environment. The study aims to demonstrate the capabilities of machine learning methods when they are introduced into the disciplines of this direction, which makes it possible to move from a qualitative description of most of these disciplines to a quantitative interpretation of results. The task is best solved with machine learning tools such as neural and fuzzy systems, which can handle classification, regression and clustering problems. We have analysed the composition of the disciplines and selected those where the introduction of machine learning methods matters most; these disciplines need to be supplemented with additional machine learning materials. The article proposes the composition of such materials, including theoretical foundations and practical exercises, and illustrates the use of machine learning methods with practical exercises from the discipline programs. Solutions to the most important practical tasks from the selected disciplines were obtained with the Statistica and MatLab software products.


Introduction
According to the requirements of the Federal State Educational Standard (FSES) for the direction "Organization and management of knowledge-intensive industries", the area of professional practice of graduates who have mastered the master's educational program includes such areas as high-tech product lifecycle management, quality management of high-tech enterprise management, etc. [1]. The structure of the master's educational program includes basic and variable parts, which are formed by participants of educational relations. This approach makes it possible to implement master's programs with different directions within the same training program. The disciplines included in the basic part of the master's program are compulsory, and the disciplines related to the variable part determine the direction of the program. In the curriculum of this direction, the variable part contains a number of disciplines that can be supplemented with elements of machine learning in order to advance the digitization of the educational process.
According to the authors of this article, such disciplines include "Research methods in management of knowledge-intensive industries", "Human resourcing of knowledge-intensive industries", "Methods of research and risk assessment of innovation activity", and "Methods of social and economic forecast". The study [2] presents various approaches and prospects for the use of information and communication technologies (ICT) in various educational processes. In training, the aim is to move to technology-enhanced learning (TEL), a term usually regarded as a replacement for the earlier "e-learning". Researchers often treat TEL as a synonym for equipment (infrastructure) and its use in education. However, current educational and teaching trends are more focused on using a wide range of ICT support to make these processes more convenient and attractive for both students and teachers. There is still a lot of confusion in this area about how key stakeholders can develop technologies that improve learning, how to measure this improvement, and how to assess its impact on the learning process.
As noted in [3], the paradigm shift in production known as Industry 4.0 changes the division of work between humans and machines. On the one hand, human labour is facilitated by intelligent devices and machines (human-machine cooperation), and, on the other, there is interaction and exchange of information with intelligent machines (human-machine collaboration). Digital technologies and cognitive computing are shifting traditional production boundaries. Consequently, we can introduce two types of learning as follows:
• human learning (a human as a student);
• machine learning (intelligent machine or computer as a student).
A person who undergoes training is considered as a subject in the field of education, pedagogy and cognitive psychology in relation to various learning theories. In the context of cognitive computation, artificial models and computational algorithms resemble human learning and reproduction of human skills. A model as the main element of this process is built automatically using machine learning methods.
In the age of global competition, German manufacturing companies face enormous challenges, particularly from competition with low-cost countries. Industry 4.0 now provides new opportunities to improve the efficiency of resources and processes by ensuring autonomy, decentralization and networking. In this context, assistance systems play a key role in coping with the increasing complexity of production systems, minimizing downtime of machines and automated assembly lines, and supporting on-the-job training. To ensure the active use of assistance systems, it is imperative to develop training facilities that support working neutral tasks in the context of Industry 4.0. Within such training facilities, assistance systems provide a digital learning environment [4].
Recommender systems use algorithms to provide users with recommendations on a product or service [5]. Recently, these systems started using machine learning algorithms from the area of artificial intelligence. However, it is difficult to choose a suitable machine learning algorithm for a recommender system because of the number of algorithms described in the literature. Researchers and practitioners, engaged in the development of recommender systems, have little information about existing approaches to the use of algorithms. In addition, the development of recommender systems using machine learning algorithms often involves problems and questions that need to be resolved.
Artificial intelligence, including neural networks, deep learning and machine learning, has achieved many successes and has provided new opportunities for academic research and applications in many areas, especially for business activity and the development of companies. The study [6] summarizes various applications of artificial intelligence technologies in several areas of business administration, including finance, retail trade, manufacturing, and enterprise management. Despite all the existing problems, we can draw a conclusion that the rapid development of artificial intelligence will have an impact on more areas.
The rest of the article is structured as follows. First, it considers knowledge-intensive industries and their influence on the educational process and analyses the connection between artificial intelligence and knowledge-intensive industries. It then describes the composition of artificial intelligence, of which machine learning is a subset. In the final part, the authors demonstrate solutions to the most important practical problems using machine learning methods, taking the chosen disciplines as examples.

Materials and methods

Knowledge-intensive industries
Currently, knowledge-intensive industry is an important part of any economy. Knowledge-intensive industries are those that have high absolute and relative (relative to the total production costs) research and development costs. The concept of "high-tech" production largely overlaps with the concept of "knowledge-intensive" industry. Knowledge intensity is the share of research and development costs in the total cost of production. At the same time, it is difficult to assess the knowledge intensity of some industries, primarily because of the complexity of calculating the total costs connected with science. In world practice, there is no single methodology for classifying knowledge-intensive industries.
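The definition of knowledge intensity given above can be written as a simple ratio. The symbols are introduced here only for illustration and do not come from the cited sources: K is the knowledge intensity, C_R&D is the research and development cost, and C_total is the total cost of production over the same period.

```latex
K = \frac{C_{\text{R\&D}}}{C_{\text{total}}}, \qquad 0 \le K \le 1
```

For instance, under these assumptions a hypothetical enterprise that spends 12 cost units on research and development out of 80 units of total production cost would have K = 12/80 = 0.15.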
According to the US National Science Foundation (NSF): "The absolute levels of R&D costs indicate the level of effort to produce future products and improve processes while maintaining current market share and improving operational efficiency. Moreover, such costs may show that companies acknowledge market demands for new and improved technologies" [7]. However, R&D intensity is the most frequently used measure to assess the relative importance of R&D in various industries and among companies in the same industry.
The R&D intensity differs between sectors:
• high-tech sectors (airplanes and spacecraft, electrical equipment and pharmaceuticals) are characterized by the highest R&D intensity;
• low-tech sectors (food, metallurgy and textiles) usually have low R&D intensity.
NSF's investments in discovery and innovation provide a basis for new technologies. Funding from the NSF's budget, which amounted to 7.5 billion dollars as of 2017, for research projects and infrastructure has led to discoveries that stimulated economic growth and improved the quality of people's lives [8].
According to the study [9], the high-tech sector can be defined as an industry with a high concentration of employees in such areas as Science, Technology, Engineering, and Mathematics (STEM). Although we know that it is difficult to define the term "high-tech" due to the fact that technology changes all the time, this analysis provides an approach to determining the jobs in this sector. STEM is an acronym that is often used to identify professions, as well as areas of training in science, technology, engineering, and mathematics.
Achievements of any company depend on various factors and the cost of technology, goods or services that are offered by the company. The basis of successful work is determined by the company's management methods: the level of management of subordinates by superiors, the use of modern technologies in management [10]. It is becoming crucial to introduce an infrastructure that will make it possible to get maximum effect from the use of promising management technologies. The latter comprise artificial intelligence tools and methods.

Machine learning
Artificial Intelligence (AI) is a broad and hardly formalizable concept that is associated with the presentation of knowledge, its extraction and subsequent processing [11]. Those companies that use AI methods have a significant competitive advantage over those that do not pay serious attention to this issue.
A significant amount of investment in AI consists of internal costs (R&D costs) of large companies. The large corporate investments are focused on various AI components that depend on the interests of companies. About 60 percent of investment is directed at machine learning, as it is a tool for many other technologies and applications [12]. We also need to take into account that the boundaries between various technologies have recently become blurred.
Machine learning is a group of AI methods, a characteristic feature of which is not a direct solution of the problem, but learning in the process of solving. The major issue associated with the use of machine learning methods comes down to working out a solution for assessing the membership of the observed object in one or another group. Methods that are used in AI for data processing are part of machine learning, which is a subset of AI.
According to the definition given in the study [13], machine learning is a method that forms a model based on data. Thus, a model is the final output of machine learning that is suitable for tasks related to intelligence, in particular, in situations where physical laws or mathematical equations make it impossible to build a model. The process of building a model based on training data is shown in Figure 1. The vertical arrows in Figure 1 define the learning process, and the horizontal arrows indicate the use of the model. It must be emphasized that the data used to build the model and the data to which the model is later applied are different.
Machine learning teaches computers to do what is natural for a person: learn by doing. ML algorithms use computational methods to "study" information directly from data, without relying on a predetermined equation as a model. Algorithms adaptively improve their performance as the number of samples available for training increases.
To develop learning machines, it is necessary to know what the term "learning" means and how to determine the success (or failure) of learning. Based on the training data (sample), the training algorithm should induce a function f that maps a new example submitted to the algorithm to the appropriate prediction (class or value) [14]. Among ML methods, the most interesting in the field of knowledge-intensive industry management are the methods based on the use of neural networks, fuzzy logic, as well as hybrid neuro-fuzzy systems.

Results
Let's select the following disciplines of the direction "Organization and management of knowledge-intensive industries" to include ML methods in them:
• Research methods in management of knowledge-intensive industries.
• Methods of research and risk assessment of innovation activity.
• Human resourcing of knowledge-intensive industries.
• Methods of social and economic forecast.
Students need to complete practical training in these disciplines, and below are some examples of their completion.

Research methods in management of knowledge-intensive industries
As an example, let's consider the problem of choosing a business strategy using a neural network (NN). Materials on the NN can be found in [15]. To train a network, we will need a base of examples, which can be obtained in two ways:
• we can use real indicators of companies which, according to the developer of this technique, will determine the choice of strategy;
• we can simulate the indicator values using, for example, the Monte Carlo method.
To have a more specific solution, let's assume that an enterprise produces high-tech products, in particular, a technical equipment performance monitoring system. Let's choose the following indicators which will determine the business strategy:
X1: level of prices for high-tech products;
X2: manufacturing quality level;
X3: level of product technical support;
X4: degree of conformity of the product range to the consumer needs;
X5: product innovation frequency;
X6: support of long-term consumer relations;
X7: level of costs in case of updating of products;
X8: control over product distribution channels;
X9: level of utilization of production capacities.
Moreover, let's assume that this area of activity comprises three strategic groups of competitors with the different values of the above indicators.
To solve the problem in Statistica 13, we will use a perceptron network with an input, a hidden and an output layer. The number of neurons in the input layer is determined by the number of features (nine), and the number of neurons in the output layer is determined by the number of groups (three) into which the initial data set was divided. The number of neurons in the hidden layer is chosen by the program itself, which saves the network with the best architecture in terms of training error. Having analysed the results, we have selected the best network, the characteristics of which are shown in Fig. 2.
The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, an iterative method for solving optimization problems, was used for network training. Fig. 2 shows that the selected neural network is a perceptron with the architecture 9-12-3, i.e. a network with 9 neurons in the input layer, 12 neurons in the hidden layer, and 3 neurons in the output layer.
Before the NN was used in working mode, objects that it had already "seen" during training were presented to it one by one to determine their membership in the first, second, or third group of selected strategies. For example, the first three lines in the lower window of Fig. 3, starting with the Var1 indicator, show the values of the entered parameters Var1 to Var9 with a known group membership.
The second column of the lower window in Fig. 3, designated as 1.Var10, shows that the trained NN correctly recognizes these objects. The fourth and fifth lines contain the values of new strategies (enclosed in a rectangular frame), and the network assigns enterprises with such "unknown" strategies to the second or first group.
Fig. 3 The work of the neural network
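To make the setup above more tangible, the following is a minimal sketch of a 9-12-3 perceptron in plain Python. It is not the Statistica workflow used in the article. The training examples are simulated (the Monte Carlo option mentioned above) around hypothetical group levels, and plain stochastic gradient descent is used in place of BFGS to keep the code short. All names (sample_company, GROUP_CENTERS, classify) and all numeric settings are illustrative assumptions.

```python
import math
import random

random.seed(0)  # reproducible simulated data and initial weights

N_IN, N_HID, N_OUT = 9, 12, 3    # the 9-12-3 architecture from the text
GROUP_CENTERS = [0.2, 0.5, 0.8]  # hypothetical typical indicator level per group


def sample_company(group):
    """Simulate indicators X1..X9 for a company of the given strategic group."""
    return [random.gauss(GROUP_CENTERS[group], 0.05) for _ in range(N_IN)]


def init(rows, cols):
    """Small random weight matrix stored as a list of rows."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W1, b1 = init(N_HID, N_IN), [0.0] * N_HID    # input -> hidden
W2, b2 = init(N_OUT, N_HID), [0.0] * N_OUT   # hidden -> output


def forward(x):
    """Return hidden activations (tanh) and output probabilities (softmax)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    z = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
    top = max(z)
    e = [math.exp(v - top) for v in z]
    total = sum(e)
    return h, [v / total for v in e]


def mean_loss(data):
    """Average cross-entropy of the network on labelled examples."""
    return -sum(math.log(forward(x)[1][g]) for x, g in data) / len(data)


def train(data, epochs=100, lr=0.1):
    """Stochastic gradient descent (a stand-in for BFGS, which the article uses)."""
    data = list(data)
    for _ in range(epochs):
        random.shuffle(data)
        for x, g in data:
            h, p = forward(x)
            # Gradient of cross-entropy with respect to the output logits.
            dz = [p[k] - (1.0 if k == g else 0.0) for k in range(N_OUT)]
            # Back-propagate to the hidden layer before W2 is changed.
            dh = [(1.0 - h[j] ** 2) * sum(dz[k] * W2[k][j] for k in range(N_OUT))
                  for j in range(N_HID)]
            for k in range(N_OUT):
                for j in range(N_HID):
                    W2[k][j] -= lr * dz[k] * h[j]
                b2[k] -= lr * dz[k]
            for j in range(N_HID):
                for i in range(N_IN):
                    W1[j][i] -= lr * dh[j] * x[i]
                b1[j] -= lr * dh[j]


def classify(x):
    """Return the group number (1, 2 or 3) with the highest probability."""
    probs = forward(x)[1]
    return probs.index(max(probs)) + 1


examples = [(sample_company(g), g) for g in range(N_OUT) for _ in range(30)]
loss_before = mean_loss(examples)
train(examples)
loss_after = mean_loss(examples)

new_company = sample_company(1)  # an "unknown" strategy simulated near group 2
print(loss_before, loss_after, classify(new_company))
```

As in the article, the network is first fitted to examples with known group membership and is then asked to assign a new, unseen set of nine indicators to one of the three groups. The printed values depend on the random seed and on the assumed group levels.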

Human resourcing of knowledge-intensive industries
Let's see how to solve the problem of quantitative assessment of employee motivation. To determine the factors that influence employee motivation, we use the technique proposed in [16], which identifies four factors that exist in each company and affect employee motivation:
• management style;
• system of remuneration;
• organizational climate;
• nature of work.
To solve this problem, let's apply fuzzy logic (FL), because it models human reasoning, which is fuzzy and uncertain and is usually expressed in linguistic terms [17]. Figure 4 shows the fuzzy inference system obtained using MatLab, which consists of the four input variables mentioned above and the output parameter that determines the motivation score.

Fig. 4 Fuzzy inference system
As a rule, FL determines ranges of changes, the number of gradations, types of membership functions for each parameter. Table 1 shows the values of the ranges of change for each variable, the established gradations, and the selected membership functions.
Once the type of the membership function is chosen and the rule base is formed, the work of the created system can be assessed through the option "View Rules", a fragment of which is shown in Fig. 5.
Let's assume that the levels of management style, remuneration, organizational climate and nature of work take the values indicated in the input window of Fig. 5. The degree of motivation of such an employee will then be 80 points on a 100-point scale. The lower limit of the scale corresponds to a lack of motivation, and the upper limit to the greatest degree of motivation.
Fig. 5 Operation of fuzzy inference system
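The inference steps described in this section (membership functions, rule base, defuzzified output) can be sketched in plain Python as follows. This is a minimal Mamdani-style illustration, not a reproduction of the MatLab system from the article. The 0 to 10 input scale, the three triangular terms per variable and the simplified rule base are assumptions made here, since the ranges and membership functions of Table 1 are not reproduced in the text, and the function name motivation_score is hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed scales: each input factor is rated 0..10, motivation is scored 0..100.
INPUT_TERMS = {"low": (-5, 0, 5), "medium": (0, 5, 10), "high": (5, 10, 15)}
OUTPUT_TERMS = {"low": (-50, 0, 50), "medium": (0, 50, 100), "high": (50, 100, 150)}


def motivation_score(style, pay, climate, work):
    """Mamdani inference: min implication, max aggregation, centroid output.

    Inputs are expected to lie within the assumed 0..10 range.
    """
    inputs = (style, pay, climate, work)
    # Hypothetical rule base: "IF factor is T THEN motivation is T" for each
    # factor and each term T; rules with the same consequent are OR-ed (max).
    strength = {
        term: max(tri(value, *INPUT_TERMS[term]) for value in inputs)
        for term in INPUT_TERMS
    }
    numerator = denominator = 0.0
    for y in range(0, 101):  # discretised output universe
        mu = max(min(strength[t], tri(y, *OUTPUT_TERMS[t])) for t in OUTPUT_TERMS)
        numerator += mu * y
        denominator += mu
    return numerator / denominator


print(round(motivation_score(8, 7, 9, 8), 1))
```

With this simplified rule base an employee rated "medium" on every factor receives the midpoint of the scale, and higher ratings push the score towards the upper limit. A realistic rule base would combine the four factors in each rule, as the "View Rules" window in Fig. 5 suggests.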

Methods of research and risk assessment of innovation activity
When the results of innovation are assessed, the indicators of risk and of project efficiency are equivalent but opposite in meaning: high risk corresponds to low efficiency and vice versa. Therefore, we will demonstrate the solution to the problem of assessing the efficiency of an investment project. Combining NN and FL into one hybrid system makes it possible to eliminate the disadvantages of each technology and to create a neuro-fuzzy system (NFS) of the ANFIS (Adaptive Network-Based Fuzzy Inference System) type [18,19].
The use of the ANFIS system requires a base of examples, which, in our case, is the data of the financial plan of a certain enterprise. The input indicators will include sales revenue (X1), cost of production (X2), advertising costs (X3), operating costs (X4), and loan charges (X5); the output parameter is profit (X6).
The data loaded into MatLab is shown in Fig. 6, where 22 circles identify the training sample, and 8 points identify the test sample. The structure of the formed ANFIS system is shown in Fig. 7, demonstrating that the NFS has 5 layers: the first layer, consisting of 5 nodes, corresponds to the input parameters; the second layer contains the membership functions for each variable; the third layer contains the rules formed (14 in total); the fourth layer shows the output values corresponding to each rule; the fifth layer is the system output. The NFS test results are shown in Fig. 8, including 8 pairs of points: test data and results obtained by the ANFIS system.

Fig. 8 Test results
The system functioning is illustrated in Fig. 9 (a fragment of the "View Rules" window is shown): when entering five values of the vector of input features, the system evaluates the effectiveness of this project version at 96 points on a 100-point scale.
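The layered structure described above can be illustrated with the forward pass of a first-order Sugeno fuzzy system, which is the type of inference that ANFIS tunes during training. The sketch below only evaluates a system with fixed parameters and does not train anything. The two rules, the Gaussian membership functions, the inputs scaled to the range 0 to 1 and all coefficient values are hypothetical and are not taken from the article (which reports 14 rules).

```python
import math


def gauss(x, center, sigma):
    """Gaussian membership function."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_forward(x, rules):
    """Forward pass of a first-order Sugeno system.

    Each rule is a dict with per-input membership parameters "centers" and
    "sigmas", and linear consequent coefficients "coeffs" (last one is the bias).
    Returns the system output and the normalised rule weights.
    """
    # Membership and rule layers: the firing strength of a rule is the
    # product of the membership degrees of all inputs.
    strengths = []
    for r in rules:
        w = 1.0
        for xi, c, s in zip(x, r["centers"], r["sigmas"]):
            w *= gauss(xi, c, s)
        strengths.append(w)
    total = sum(strengths)
    weights = [w / total for w in strengths]
    # Rule-output layer: each rule proposes a linear function of the inputs.
    outputs = [
        sum(a * xi for a, xi in zip(r["coeffs"], x)) + r["coeffs"][-1]
        for r in rules
    ]
    # Output layer: weighted sum of the rule outputs.
    return sum(w * o for w, o in zip(weights, outputs)), weights


# Hypothetical two-rule system over five inputs X1..X5 scaled to 0..1.
RULES = [
    {"centers": [0.2] * 5, "sigmas": [0.25] * 5, "coeffs": [10, -5, 0, 0, 0, 30]},
    {"centers": [0.8] * 5, "sigmas": [0.25] * 5, "coeffs": [10, -5, 0, 0, 0, 85]},
]

score, rule_weights = sugeno_forward([0.8, 0.3, 0.7, 0.6, 0.9], RULES)
print(score, rule_weights)
```

In an ANFIS-type system the membership parameters and the consequent coefficients would be fitted to the training sample, and the fitted system would then be checked on the test sample, as in Fig. 8.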

Methods of social and economic forecast
When studying this discipline, along with the traditional methods of forecasting, set forth, for example, in studies [20,21], we will show the possibility of applying the neural network approach to form the forecast of a time series (TS).
Usually, the next value of TS is forecasted based on some number of its previous values. Once the next estimated value is calculated, it is substituted back and is used (as well as previous values) to obtain the next forecast, which is called the time series projection. This approach is sometimes called the sliding window method. We will illustrate this using the following example. Let's take a series of changes in Gazprom shares for 150 days as the TS, as shown on the graph in Figure 10.
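The sliding window procedure described above can be sketched in a few lines of Python. The sketch shows how training pairs are cut from the series and how each forecast is fed back to produce the next one. The one-step predictor here is a hypothetical stand-in (linear extrapolation of the last two values) rather than a trained neural network, and the price values are made up for illustration and are not the Gazprom data.

```python
def make_windows(series, width):
    """Split a series into (window, next value) training pairs."""
    return [
        (series[i:i + width], series[i + width])
        for i in range(len(series) - width)
    ]


def project(series, predict_next, width, steps):
    """Iterated forecast: each prediction is fed back as the newest input."""
    history = list(series)
    forecast = []
    for _ in range(steps):
        value = predict_next(history[-width:])
        forecast.append(value)
        history.append(value)
    return forecast


def linear_extrapolation(window):
    """Hypothetical stand-in for a trained network: repeat the last step."""
    return window[-1] + (window[-1] - window[-2])


prices = [100.0, 101.0, 102.0, 103.0, 104.0]  # made-up values for illustration
pairs = make_windows(prices, width=3)
print(pairs[0])
print(project(prices, linear_extrapolation, width=3, steps=10))
```

In practice, the pairs returned by make_windows would be used to train the network, and the trained network would replace linear_extrapolation in the call to project.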
The forecast was made in Statistica 13 using a three-layer perceptron (multilayer perceptron, MLP). Based on the training results, the program saves the top 5 networks, selected by the smallest training error. Let's use the network with the smallest training error and make a forecast for 10 points ahead. The result of the forecast is shown in Fig. 12.
Fig. 12 Forecast of a series by 10 points ahead
When studying this discipline, the practical training should include tasks for forecasting TS not only with the help of traditional methods, but also with the help of NN techniques.

Conclusion
Thus, the article shows that the work programs of disciplines in the direction "Organization and management of knowledge-intensive industries" should include materials on the use of machine learning methods. This will improve the level of knowledge gained by students and increase their competitiveness in the labour market.