Crowd Face Detection with Naive Bayes in Attendance System Using Raspberry Pi

PT. Restu Agung Narogong is a company with 176 employees. Queues often occur during the attendance process, both on arrival and on departure, because each employee must register attendance manually, which is time-consuming during shift changes. A biometric system is therefore needed so that the attendance system can identify employees without manual registration. One alternative biometric approach is face recognition using computer vision. The purpose of this work is to implement crowd face detection on a Raspberry Pi using the Naïve Bayes classifier. The system uses an algorithm to extract facial characteristics as mathematical data, which are then compared with the facial characteristics stored in the database. The device is programmed in Python with several scientific Python libraries. The Naïve Bayes method was tested on a sample dataset of 370 augmented facial images. The implementation achieves an accuracy of 76.31%, a precision of 78.25%, and a recall of 81.25%. The background and lighting of the captured image affect the accuracy of the device.


Introduction
The rapid advancement of technology has resulted in the emergence of various electronic devices and software that assist human activities, so that improved time efficiency increases the productivity of a company or organization [1]. In any company, whether large or small and medium-sized, attendance is indispensable. Many companies use attendance as a basis for salary [2]. One of the electronic attendance devices now being developed uses fingerprints. Fingerprints are relatively effective for validating and ensuring employee attendance, because fingerprinting is a biometric technology that offers biological authentication, which enables the system to recognize the user accurately [3].
A biometric recognition system, sometimes known as a biometric system, is an authentication system that uses biometrics. The biometric system automatically identifies a person based on a biometric characteristic by matching that characteristic to a biometric feature stored in the database [3]. Besides fingerprints, other commonly used biometric systems are face recognition [4] and retinal scans [5]. The advantages of fingerprint biometric systems are security, ease of use, non-transferability, and higher accountability [6]. Their disadvantages are exclusions, cost, and system failure [7]. The advantages of facial recognition systems are that they are robust, touchless, effortless, real-time, and valid. Their disadvantages are privacy concerns, maturity of the technology, and storage [8]. The advantages of retinal scan systems are that they are the most reliable, very quick, and based on unique data points [9]. Their disadvantages are health risks to the eye and intrusiveness [10]. Based on the aforementioned advantages, fingerprinting is the most common biometric system, especially for attendance logging in industry [11]. However, it has weaknesses related to scanner issues and physical traits.
In this research, the application of a facial recognition system as an alternative biometric system was studied, with a case study at PT. Restu Agung Narogong. This manufacturing company has 176 employees, and queues often occur during the attendance process, both on arrival and on departure. Employees need to register their attendance, which is time-consuming during shift changes. Therefore, a biometric system is needed so that the attendance system can identify employees without manual registration. Using facial recognition technology from computer vision, employee facial recognition can serve as an alternative method of monitoring employee presence, recording attendance by capturing many faces in a crowd. Employees are only required to face the camera while walking through the attendance gate. The system detects many faces in a walking crowd, so it can reduce the attendance queue that occurs with fingerprint scanning.
As one of its functions, the system performs face detection and facial recognition. The facial detection process uses the Naïve Bayes Classifier method to classify employees' faces. The Naïve Bayes Classifier is a classification algorithm and a machine learning method that uses probability calculations and statistics, as put forward by Thomas Bayes. The algorithm is used to predict future probabilities based on experience [12].

Literature Review
Face recognition in a crowd has been studied by [13]. That work uses skin segmentation, as do many other face recognition methods. Partial face detection has recently become necessary and has drawn considerable attention in the information science and technology community. The purpose of the study was to investigate new ideas for recognizing a specific individual in a crowd, including when face information is obscured. It reports an in-depth investigation of face identification under partial visibility conditions such as chrominance, lighting, posture changes, and saturation, among others, and presents a new algorithm and system for partial face recognition and for perceiving a specific individual in a crowd. Finally, common classifiers such as neural networks, SVM, and HMM were applied to assess the proposed approach, and a high detection rate was achieved.
Another study on crowd detection has been done by [14]. Its main aim is to describe an empty seat detection system that determines the number of seats left vacant in a hall. The system combines the Viola-Jones algorithm with template-based correlation matching and is reported to be highly efficient. This technology aids crowd management by regularly updating the number of empty seats.
Facial image recognition in [15] uses Naive Bayes to classify the result of eigenface feature extraction. Z-score normalization is included to improve accuracy. To evaluate the performance of the suggested method, a dataset of 200 images is separated into training data and testing data using cross validation (k=10). The outcomes show that the suggested method can predict the facial image with an accuracy of up to 70%. With Z-score normalization included, the average prediction accuracy increases to 89.5%.
A face recognition system for biometrics has been proposed by [16]. That research uses a Raspberry Pi as its processing unit and the PyImageSearch library for face recognition, which is accomplished using Davis King's Dlib and Adam Geitgey's module. Face detection uses the standard frontal face Haar Cascade classifier in the form of an XML file. The dataset comprises 5 persons with 30 photographs each, for a total of 150 photos. Size parameters, scale factor parameters, and neighborhood parameters are the three categories of testing parameters. The size parameter variations in pixels are 20x20, 25x25, 30x30, and 35x35, the scale factor values are 1.1, 1.2, 1.3, and 1.4, and the neighborhood values are 3, 4, 5, and 6. Face recognition testing is done at four distinct distances: 1.5 meters, 2 meters, 2.5 meters, and 3 meters. The test results reveal that the best parameters, namely a size of 20x20, a scale factor of 1.1, and a neighborhood of 3, achieve the greatest accuracy of 80% and a true positive rate of 100%.

Research Method
The research framework is an overview that explains the general logic flow of the research. The research framework of this study can be explained using Figure 1.

Fig.1. Framework of Research
Based on Figure 1, the stages performed in this study can be divided into nine steps as follows:

Problem Identification Stage
This stage is to find out the problems that often occur in the attendance system at PT. Restu Agung Narogong.

Data Collection Stage
Data are collected to support the development of the application. Data collection uses interviews, observation, and data studies.

Requirement
The requirement stage determines the needs of system development, such as software and hardware requirements.

Design
At this stage, the system is designed based on the proposed solutions to the identified problems, including the design of the application's appearance.

Evaluation
The prototype is evaluated against the design to determine whether it suits the problem.

Coding
At this stage, the design is implemented in the system to be built using Python and MySQL.

Testing
The developed system is tested for conformity with the design using black box testing, and the model is evaluated with a confusion matrix to determine its performance percentages.

System Evaluation
System evaluation summarizes the advantages and disadvantages of the developed system and provides conclusions and suggestions for further development.

System Implementation
At this stage, the system is put into use according to the needs of PT. Restu Agung Narogong.
The attendance system at PT. Restu Agung Narogong can be explained using Figure 2.

Fig.2. Working system analysis
Figure 2 shows the employee attendance system that runs at PT. Restu Agung Narogong. The process of the system can be explained as follows:
1. Attendance Officer
Officers are available in advance to maintain, provide, and document employee attendance.
2. Employee
Employees register their attendance in the attendance book, and the book is then handed over to the attendance officer, who does the documentation (noting absences).
3. Attendance Recap
Attendance that has been documented by the attendance officer is then gathered as a report.
The flowchart for face recognition can be described using Figure 3 as follows:

Testing Set
A test set is a data set used to assess the strength and utility of a predictive relationship. The test set is obtained with two approaches: first, using the face images contained in the face database; second, using facial images obtained in real time or from video.
Face images are captured, and faces detected, by a webcam. After the user's face is detected, the face image is located (face locating), and the application then tracks the face of the user (face tracking). The system used for face tracking is the Two-Dimensional System, which tracks faces and outputs the position in image space where the subject's face is located. In this application, the Haar-Cascade Classifier is used to detect human faces or subjects. The main basis for this classifier is the Haar-like feature. This feature uses changes in the contrast value between adjacent rectangles rather than the intensity values of individual pixels.
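The Haar-like feature described above can be illustrated with a minimal sketch. It is not the cascade used in the application; it only shows how a two-rectangle feature measures the contrast between adjacent regions using an integral image. The function names and the toy 4x4 patch are assumptions made for illustration.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle (edge) Haar-like feature: left half minus right half."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right


# Hypothetical grayscale patch: bright left half, dark right half.
patch = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)  # large positive value: strong vertical edge
```

A cascade classifier evaluates many such features at many positions and scales; the integral image makes each rectangle sum a constant-time lookup.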

Feature Extraction
Feature extraction is used to find the most appropriate representation of the image so that it can be identified. The main task of feature extraction is to sense the similarities between the test set and the training set. This task requires feature extraction to find relevant distance measures in the selected feature space.

Projection of Test Image
The face image to be tested is projected onto the training model image to get the exact extracted features.
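One common way to realize such a projection is an eigenface-style linear projection, in which the test image is mean-centered and then expressed as weights over a small set of basis vectors learned from the training set. The sketch below assumes that approach, which the text does not state explicitly; the names `project`, `mean_face`, and `basis`, and all numeric values, are hypothetical.

```python
def project(face, mean_face, basis):
    """Project a flattened face vector onto each basis vector after mean-centering."""
    centered = [p - m for p, m in zip(face, mean_face)]
    return [sum(c * b for c, b in zip(centered, vec)) for vec in basis]


# Hypothetical 4-pixel "images" and an orthonormal two-vector basis.
mean_face = [100.0, 100.0, 100.0, 100.0]
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
test_face = [120.0, 80.0, 110.0, 90.0]

# The resulting weights form the feature vector passed to the classifier.
weights = project(test_face, mean_face, basis)
print(weights)
```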

Feature Vector
A feature vector is a vector representation of an image. The vector contains random variables that indicate whether or not the image is a face.

Classifier
The classifier in this application is Naïve Bayes, which classifies the data (feature vectors) based on probability.

Decision Making
After the class probabilities are known and the test vector has been classified as belonging to a certain subject class, a decision is made: the subject in the test set is taken to be the same as the subject of that class in the training set.
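The classification and decision steps can be sketched with a small Gaussian Naïve Bayes model. This is a minimal illustration under assumptions, not the application's code: it assumes continuous features with a Gaussian likelihood, it uses the usual Naïve Bayes decision rule of picking the class with the highest posterior, and the two-dimensional feature vectors and class labels are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate the prior, feature means, and feature variances of each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    total = len(samples)
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / total, means, variances)
    return model


def log_posteriors(model, x):
    """Unnormalized log posterior of each class for feature vector x."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        scores[label] = score
    return scores


def predict(model, x):
    """Decision step: return the class with the highest posterior."""
    scores = log_posteriors(model, x)
    return max(scores, key=scores.get)


# Hypothetical feature vectors for two enrolled subjects.
train_x = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]
train_y = ["employee_a"] * 3 + ["employee_b"] * 3
model = fit_gaussian_nb(train_x, train_y)
predicted = predict(model, [1.1, 2.0])
print(predicted)
```

Working in log space avoids numerical underflow when many per-feature likelihoods are multiplied together.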

Results and Discussion
The assembled hardware can be seen in Figure 4.
The design interface for this application is as follows:
Employee data page
The employee data page displays information on each employee that has been added. It shows the employee's name and NIK, and it can also be used to edit and delete data.
In this study, the experimental data were first taken from photos of members of the band "One Direction": Niall Horan, Liam Payne, Louis Tomlinson, and Harry Styles. They have different facial characteristics, which are stored in the system database as training data. At the analysis stage, the One Direction photos are tested simultaneously and matched against the face images stored in the system database. The data were taken from the 4 members using 3 photos with different poses. After the augmentation process, 370 photos were obtained.
The confusion matrix method is tested using a sample dataset of face images of several members of the band "One Direction" as the test set.
The result for each class is obtained as follows:
Accuracy = (number of correct predictions / total number of predictions) × 100% = 76.31%
Based on the calculation in Table 4, which uses the data in Table 3, the accuracy of the application is 76.31%, the average precision is 78.25%, and the average recall is 81.25%.
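The way accuracy, average precision, and average recall follow from a confusion matrix can be sketched as below. The 2x2 matrix is a hypothetical toy example and is not the data from Table 3; the averages are unweighted (macro) averages over classes, which is an assumption about how the reported averages were computed.

```python
def evaluate(confusion):
    """Return (accuracy, macro precision, macro recall).

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    precisions, recalls = [], []
    for k in range(n):
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                           # row sum
        precisions.append(confusion[k][k] / predicted_k if predicted_k else 0.0)
        recalls.append(confusion[k][k] / actual_k if actual_k else 0.0)
    return correct / total, sum(precisions) / n, sum(recalls) / n


# Hypothetical two-class confusion matrix.
toy = [
    [8, 2],
    [1, 9],
]
accuracy, avg_precision, avg_recall = evaluate(toy)
print(f"accuracy={accuracy:.2%} precision={avg_precision:.2%} recall={avg_recall:.2%}")
```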

Conclusion
The conclusions of this study, based on the implementation that has been done, are as follows:
1. A Crowd Face Detection attendance model has been built using a Raspberry Pi and a camera with the Naïve Bayes Classifier method. It has data storage features on the server, and the tool can capture several employees' faces simultaneously to record their attendance at the same time.
2. Based on the evaluation using the confusion matrix, the accuracy of the application is 76.31%, the average precision is 78.25%, and the average recall is 81.25%.
3. Based on the evaluation by users, this face recognition application is easy to use and useful as an attendance system. However, the information displayed in the user interface needs to be improved as needed.
Based on the results of the implementation, the suggestions for application development in future research are as follows:
1. Using other algorithms, such as a CNN (Convolutional Neural Network), for the first step of face recognition, namely face detection.
2. Using more complex and detailed features, so that the system not only detects the faces of several people but also recognizes eyes, voices, and photos.
3. When facial recognition is done using a webcam, there should not be too much interference (noise) behind the user. It is better to use a plain background and a supporting lighting system.
4. The camera used should have a minimum resolution of 8 megapixels, with the distance of the user's face to the camera not less than ± 4 m at the time of facial recognition.

Fig.3. Flowchart Software Design
The flowchart stages are explained as follows:
1. Face Database
The face database is a collection of face images used in a face recognition system. The face images contained in the face database can be used as a training set or a testing set.
2. Training Set
A training set is a data set used to find potentially predictive relationships. The face database must have an image of the face of each person or subject in the training set. The facial images in this training set should represent a front view of the person or subject with little difference in point of view. The training set should also include different facial expressions, different lighting and background conditions, and the use of attributes on the face. The training set is prepared under the assumption that all images have been normalized to an m x n array and that each facial image contains only the face area and does not include many other parts of the body.

Fig.5. Employee data page
Fig.6. Training data page
The training data page is used to upload training data. It can also be used to add or delete training data.

Fig.7. Attendance page
Face Recognition During the Testing Process

Fig.8. Face Recognition During the Testing Process
The result of implementing the system for face detection in employee attendance can be seen in Figure 9. Based on this figure, the application can detect employees at the gate.

Table 1. Training data

Table 2. Testing data
To generate more data for training, the training data are augmented using the Python library Keras. The Keras augmentation parameters are used as follows:
zoom_range, to enlarge the image by a random amount.
horizontal_flip, to randomly flip half of the images horizontally.
fill_mode, to fill newly created pixels that may appear because of rotation or a width or height shift.
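A minimal sketch of such an augmentation setup is shown below. It assumes the `ImageDataGenerator` class from `tensorflow.keras.preprocessing.image`, whose availability depends on the installed TensorFlow/Keras version. The parameter values are hypothetical, since the text names the parameters but not their settings, and the helper `build_augmenter` is introduced only for illustration.

```python
# Hypothetical settings; the source names these parameters but not their values.
AUGMENTATION_ARGS = {
    "zoom_range": 0.2,        # zoom in or out by up to 20%, chosen at random
    "horizontal_flip": True,  # randomly mirror images left to right
    "fill_mode": "nearest",   # fill pixels exposed by a transform with the nearest value
}


def build_augmenter():
    """Create a Keras image generator configured with the settings above."""
    # Imported lazily so the settings can be inspected without TensorFlow installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(**AUGMENTATION_ARGS)
```

Calling `build_augmenter().flow(images, batch_size=...)` on a 4-D NumPy array of images would then yield randomly transformed batches, which is one way a few source photos can be expanded into a larger training set.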

Table 4. The Result of Testing Model