Intelligent Tutoring System Using Bayesian Network for Vocational High Schools in Indonesia

The lack of personal tutorials during school hours has made the learning approach at Vocational High Schools less than optimal, and students' competence is not fully achieved. Several computer-based self-learning systems have been developed to address these problems. Unfortunately, these systems do not take the diversity of students' abilities into account. Therefore, this research proposes an Intelligent Tutoring System (ITS) model using a Bayesian Network at Vocational High Schools (SMK) to determine students' ability levels and teach skill competency materials according to each student's ability level. This is quantitative research with a quasi-experimental, one-group pretest-posttest design. The research participants were 69 students of the Computer and Network Engineering expertise program at SMK Negeri 4 Gowa and SMK Negeri 1 Gowa, South Sulawesi Province, Indonesia. The results showed significant differences in students' learning outcomes after using the proposed ITS; in other words, the proposed ITS was effective in increasing the skill competency of Vocational High School students. The evaluation results showed that the Bayesian Network model had a high level of accuracy, reaching 84%.


Introduction
Vocational High School (SMK) is an educational institution that aims to provide knowledge and skills to students who will enter employment and to produce the skilled employees needed by the community [1]. Vocational school graduates are expected to be ready to work and compete in a corporate world that demands skills and competencies. The current educational method does not support this goal: the lack of personal tutorials during school hours causes some students to miss material, so they cannot proceed to the material that follows. As a result, the learning approach becomes less than optimal and students' competence is not fully achieved. Because of time constraints and overcrowded classes, teachers cannot meet every student's needs, so the gap in understanding and ability among students goes unaddressed. Several computer-based self-learning systems have been developed as solutions to these problems, including [2,3], which are designed to provide personal tutorials to vocational school students and improve their understanding at the same time. This solution has gained ground as more software is developed to provide personal tutorials. Even though computer-based self-learning systems are widely used, their disadvantage is that the learning ignores the diversity of individual students' abilities [4,5]. In other words, these systems assume that all students have the same level of knowledge. This tutorial approach is therefore not optimal and does not reflect the personal guidance that takes place between students and teachers.
An Intelligent Tutoring System (ITS) is a computer-based learning system that assesses students' characteristics so that the instruction given can adapt to their needs, based on the principles and objectives of learning [6,7]. Compared to other computer-based self-learning systems, an ITS addresses the failure to account for the diversity of students' (users') abilities by teaching materials according to those abilities [7,8]. An ITS is a computer program that solves learning problems, monitors, trains, or consults with students, and has proven successful in improving their abilities [8-12].
A Bayesian Network (BN) is a probabilistic graphical model, built on probability and graph theory, that provides an approach to drawing an inference or conclusion [13]. A BN connects a set of variables through conditional probabilities that yield the probability or likelihood of an event occurring, computed using Bayes' theorem [14,15]. Because of these capabilities, Bayesian Networks have been used to model knowledge and expert reasoning in many application domains [16]. Therefore, this research aims to propose and implement an Intelligent Tutoring System using a Bayesian Network in Vocational High Schools. This model allows the ITS to infer each student's ability individually so that the learning can adjust to each student's ability level.

Research Design
This research is quantitative research with a quasi-experimental, one-group pretest-posttest design that focuses on the improvement in competence (learning outcomes) after students use the proposed ITS. The research subjects were first given a pretest to determine their initial ability (understanding). After the pretest, the students received the learning treatment using the proposed ITS. Once the learning process was completed, the students were given the post-test to reveal their level of understanding of their expertise competency after the treatment.

Sample and Research location
The samples in this study were the grade XI students of the Computer and Network Engineering (TKJ) expertise program in the academic year 2020/2021 at SMKN

Materials
To scope this research, the competency standards were taken from the case study conducted. Six competency standards for the computer and network engineering expertise program are adopted in the proposed ITS, based on the lesson plan documentation in the schools. The competency standards are listed in Table 1. In addition, the instructional materials taught to the students (domain module) were prepared in accordance with the predetermined competency standards.
Figure 1 shows the structure of the TKJ vocational materials. This structure is also the basis for compiling the 35 test items used for the ability test (pre-test) and the post-test. The instructional materials and the test were evaluated and validated by 3 material experts.

Evaluation
Evaluation measures the performance of the developed BN model using Cross-Validation and a Confusion Matrix. Cross-Validation is a statistical evaluation method that divides the data into two subsets, a training set and a test set, and is an effective way to estimate the success rate (accuracy) [17]. This method produces the values of the Confusion Matrix. A Confusion Matrix is a performance measurement for a machine learning model that compares the predicted values with the actual values [18]. It is a table of the four combinations of predicted and actual values: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). From the Confusion Matrix, the Precision, Recall, Accuracy, and F-Measure values can be computed to determine the accuracy of the BN model.
In addition, an analysis was carried out to determine the effect of using the proposed ITS on students' competencies (learning outcomes) by examining the difference between the students' pre-test and post-test scores. The data analysis technique used is Normalized Gain (N-Gain).
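As a minimal sketch of how these four measures follow from the Confusion Matrix counts, the Python function below applies the standard definitions. The function name and the example counts are illustrative assumptions and are not values from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F-measure from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# HYPOTHETICAL counts, for illustration only (not taken from the paper)
example = classification_metrics(tp=40, tn=45, fp=10, fn=5)
```

For a three-class output such as Low, Medium, and High, the counts would typically be taken per class (one class against the rest) and then averaged; the text does not state which averaging scheme was used.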
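The N-Gain formula itself is not reproduced in the text. The sketch below assumes the commonly used definition, in which the achieved improvement is divided by the maximum possible improvement, together with commonly used category thresholds (high at 0.7 and above, medium from 0.3, low otherwise). The maximum score of 100, the thresholds, and the example scores are assumptions for illustration.

```python
def n_gain(pre, post, max_score=100.0):
    """Normalized gain: (post - pre) / (max_score - pre)."""
    if max_score <= pre:
        raise ValueError("pre-test score must be below the maximum score")
    return (post - pre) / (max_score - pre)


def gain_category(g):
    """ASSUMED thresholds: high >= 0.7, medium >= 0.3, otherwise low."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"


# HYPOTHETICAL scores, for illustration only (not taken from the paper)
g = n_gain(pre=50.0, post=70.0)  # (70 - 50) / (100 - 50)
category = gain_category(g)
```

Under these assumed thresholds, a gain of 0.4 falls in the medium category.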

Proposed Intelligent Tutoring System and its Implementation
The proposed BN module in the ITS provides material recommendations and classifies students' ability levels into three categories, namely low, medium, and high, as shown in Figure 2. The BN in the pedagogical module infers each student's ability individually. It takes as input the students' answers to the ability test questions. From this quantitative input, the BN infers scores that quantitatively represent each student's ability.
After the students take the ability test, the BN provides material and classifies the students' ability level using variables obtained from the ability test results. The pedagogical module (the teaching concept) has been validated by experts in the field (teachers). This teaching model cannot be modified, and the teaching strategies it defines are fixed and permanent. A student interacts with the ITS through the Interface Module. These interactions are divided between two sub-modules: the input module and the output module. The main purpose of the input module is to update the BN based on the evidence collected from the ability test results. The student module stores student information along with the material recommendations and the level of understanding of each student. The domain module contains the learning materials (text or video). This module supplies material to the pedagogical module based on the information stored in the student module.
The proposed ITS takes the student ability test results as input and then determines the ability level and material recommendations using the BN algorithm. The output of the algorithm is a recommendation of different instructional materials for each student according to the material mastery probability given by the BN. This is explained in terms of the information system in Figure 3.

Bayesian Network Structure
The BN structure is derived from the ability test results, namely a series of 35 questions (Q) that represent the 6 predetermined Competency Standards (CS). The last layer holds the ability level variable with states Low (L), Medium (M), and High (H). The BN structure can be seen in the following Figure.
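The three-layer structure described above can be sketched as a list of links between nodes. The assignment of questions to competency standards is not given in the text, so the round-robin mapping below is purely hypothetical, and the direction of the links (questions to standards to ability level) is likewise an assumption based on the layer order.

```python
N_QUESTIONS = 35   # test items, from the text
N_STANDARDS = 6    # competency standards, from the text
LEVELS = ("L", "M", "H")


def build_structure(question_to_cs):
    """Return (parent, child) links for the layers Q -> CS -> Ability."""
    links = [(f"Q{q}", f"CS{cs}") for q, cs in sorted(question_to_cs.items())]
    links += [(f"CS{cs}", "Ability") for cs in range(1, N_STANDARDS + 1)]
    return links


# HYPOTHETICAL mapping: questions dealt round-robin across the standards.
# The real mapping would follow the structure of the instructional materials.
question_to_cs = {q: (q - 1) % N_STANDARDS + 1 for q in range(1, N_QUESTIONS + 1)}
links = build_structure(question_to_cs)
```

With this layout the network has 35 question nodes, 6 competency standard nodes, and one ability node with the three states in LEVELS.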

Determining Parameter
After the BN structure is formed, the next step is to determine the prior probability values for each question, for mastery of the competency standards, and for the ability level. The prior probability values are given in Tables 2, 3, and 4.

Conditional Probability
Conditional probability is the probability of an event C given that an event X has occurred, and it is denoted by P(C|X). The conditional probability of a competency standard (CS) is calculated using Equation (1).
$$P(Q_i \ldots Q_n \mid CS_i) = \frac{P(Q_i \ldots Q_n \cap CS_i)}{P(CS_i)} \tag{1}$$
In Equation (1), P(Qi…Qn | CSi) is the probability of mastery of competency standards (CS) based on questions (Qi…Qn), P(Qi…Qn ∩ CSi) is the probability of both Q and CS occurring, and P(CSi) is the probability of CSi (prior probability). The conditional probability of the ability level (L, M, H) is calculated using Equation (2).
$$P(CS_1 \ldots CS_6 \mid L/M/H) = \frac{P(CS_1 \ldots CS_6 \cap L/M/H)}{P(L/M/H)} \tag{2}$$
where P(CS1…CS6 | L/M/H) is the probability of the ability level based on competency standards, P(CS1…CS6 ∩ L/M/H) is the probability of both the competency standards and the ability level occurring, and P(L/M/H) is the probability of the ability level (prior probability). Rearranging Equation (2) gives the joint probability distribution (JPD) in Equation (3).
$$P(CS_1 \ldots CS_6 \cap L/M/H) = P(CS_1 \ldots CS_6 \mid L/M/H)\, P(L/M/H) \tag{3}$$
where P(CS1…CS6 | L/M/H) is the probability of the ability level based on competency standards (conditional probability), and P(L/M/H) is the probability of the ability level (prior probability). Probabilistic inference, that is, determining the ability level for a sample, is done by comparing the JPD value for each ability level and taking the level with the highest probability as the conclusion.
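The inference step, comparing the joint probability for each ability level and keeping the largest, can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: the competency standards are assumed to be conditionally independent given the ability level, each standard is treated as simply mastered or not mastered, and all probability values are hypothetical.

```python
from math import prod

LEVELS = ("L", "M", "H")


def infer_level(evidence, prior, likelihood):
    """Return the level with the highest joint probability, plus all joint values.

    evidence:   dict of CS name -> True/False (mastered or not)
    prior:      dict of level -> P(level)
    likelihood: dict of level -> dict of CS name -> P(CS mastered | level)
    """
    joint = {}
    for level in LEVELS:
        # ASSUMPTION: standards are conditionally independent given the level
        conditional = prod(
            likelihood[level][cs] if mastered else 1.0 - likelihood[level][cs]
            for cs, mastered in evidence.items()
        )
        # Joint probability = conditional probability * prior, as in Equation (3)
        joint[level] = conditional * prior[level]
    return max(joint, key=joint.get), joint


# HYPOTHETICAL parameters, for illustration only (not taken from Tables 2-4)
standards = [f"CS{i}" for i in range(1, 7)]
prior = {"L": 0.3, "M": 0.4, "H": 0.3}
likelihood = {
    "L": {cs: 0.2 for cs in standards},
    "M": {cs: 0.5 for cs in standards},
    "H": {cs: 0.8 for cs in standards},
}
evidence = {cs: True for cs in standards}
evidence["CS6"] = False  # five standards mastered, one not

level, joint = infer_level(evidence, prior, likelihood)
```

The joint values are not normalized; since only their ranking matters for choosing the level, normalization is unnecessary for this comparison.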

Implementation
The proposed ITS can be accessed at https://ismk.web.id through web browsers on both laptops and smartphones. The system supports a responsive display, which makes it easier for students to access it anytime and anywhere. An overview of the appearance of the developed ITS can be seen in the following illustration. Figure 5 (A) shows the Student Test page, which appears the first time students log in to the system. This test is used to determine the students' ability level. After students complete the test, their ability level and material recommendations are displayed on the ability test Result page shown in Figure.

Result and Discussion
The experiment was conducted to obtain the data used to measure how effectively the proposed ITS improves learning outcomes. The system was tested on a research sample of 69 students at SMK Negeri 4 Gowa and SMK Negeri 1 Gowa, South Sulawesi Province, Indonesia. The experiment ran for 4 weeks. The pretest was conducted on May 3rd, 2021, followed by the treatment, in this case learning with the proposed ITS, from May 3rd to May 26th, 2021, and the experiment ended with the post-test on May 27th, 2021. The results of the Bayesian Network in the proposed ITS are shown in Table 5. The results showed that 20 students (29%) reached a high ability level (H), 31 students (45%) a medium ability level (M), and the remaining 18 students (26%) a low ability level (L).

Evaluation of Bayesian Network Model
The evaluation tested the performance of the BN model using K-fold cross-validation with K = 5 and K = 10, which produced confusion matrix tables. The data used were those of the 69 students from the experimental results. The resulting accuracy, precision, recall, and F1 score values are summarized in the following table. For K = 5, the accuracy is 0.840 x 100% = 84%, the precision is 80.7%, and the F1 score is 85.5%. For K = 10, the accuracy is 0.826 x 100% = 82.6%, the precision is 79.8%, the recall is 88.8%, and the F1 score is 84%. These results indicate that the developed BN model has a high and stable level of accuracy.
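To illustrate how 69 samples divide into folds for K = 5 and K = 10, the sketch below generates contiguous train and test index sets. The function name is hypothetical, and the absence of shuffling or stratification is an assumption made for simplicity; the text does not state how the folds were formed.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs whose test folds differ in size by at most one."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Fold sizes for the 69 students reported in the text
sizes_5 = [len(test) for _, test in k_fold_indices(69, 5)]
sizes_10 = [len(test) for _, test in k_fold_indices(69, 10)]
```

Each sample appears in exactly one test fold, so every student's record is used for testing once and for training in the remaining folds. As a consistency check on the reported figures, the harmonic mean of the K = 10 precision (79.8%) and recall (88.8%) is about 84%, which matches the reported F1 score.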

Analysis of Learning Outcomes
Based on the experimental results, the researchers compared the pre-test and post-test scores. The comparison is summarized in Table 8. An analysis was then conducted to determine the extent of the improvement by examining the difference between the pre-test and post-test scores using the Normalized Gain (N-Gain) test. The results of the N-Gain calculation are given in Table 9. Table 9 shows a gain score, also known as the score change, of 0.4. This value is then interpreted against the improvement criteria for learning outcomes [19], under which a value of 0.4 is categorized as a medium score change. Therefore, it can be concluded that there is a significant improvement in learning outcomes after using the proposed ITS.

Conclusion
This research proposes an ITS model using a Bayesian Network in Vocational High Schools that can determine students' ability levels and teach skill competency materials based on each student's ability level. The experiment with the proposed ITS on 69 students of SMKN 4 Gowa and SMKN 1 Gowa placed 29% of students at the high ability level, 45% at the medium ability level, and the remaining 26% at the low ability level. The evaluation using K-fold cross-validation showed that the developed BN model has a high and stable level of accuracy, with an accuracy of 84% for K = 5 and 82.6% for K = 10. Based on the experiment conducted, the proposed ITS proved effective in improving skill competencies. This conclusion rests on the score change analysis using the N-Gain test, which yielded a score change of 0.4, categorized as a medium score change.