Issue | E3S Web Conf., Volume 475, 2024, InCASST 2023 - The 1st International Conference on Applied Sciences and Smart Technologies
---|---
Article Number | 02009
Number of page(s) | 10
Section | Environmental Impact Assessment and Management
DOI | https://doi.org/10.1051/e3sconf/202447502009
Published online | 08 January 2024
Comparison of the K-Means method with and without Principal Component Analysis (PCA) in predicting employee resignation
Informatics Department, Faculty of Science and Technology, Sanata Dharma University, Yogyakarta, Indonesia
* Corresponding author: iwan@usd.ac.id
Employees are individuals who work for a company or organization and receive a salary. Employees are the most important assets of a company and need to be managed effectively in order to maximize their contribution. However, many employees feel dissatisfied with the outcomes of their contributions to the company, as they do not receive the expected rewards. This study uses a dataset from Kaggle.com consisting of 14,999 rows with 10 attributes. In the first experiment, the dimensionality of the dataset was reduced using PCA before the K-means clustering method was applied. In the second experiment, the dataset was fed directly into the K-means clustering method without PCA. To evaluate the clusters produced by K-means, this study applies the sum of squared errors (SSE) and the silhouette coefficient to determine the optimal number of clusters. The study concludes that two dominant factors, last_evaluation and average_monthly_hours, contribute to employees resigning from a company. The SSE evaluation indicates that both methods have an elbow point at 3 clusters, suggesting that dividing the data into more than 3 clusters does not provide significant additional information. The silhouette coefficient evaluation shows that K-means without PCA obtains the best silhouette coefficient, 0.5674, while K-means with PCA obtains 0.5491. Although K-means with PCA has the advantage of reducing the dimensionality of the dataset, it has a longer execution time: 181.53 seconds, compared with 95.84 seconds for K-means without PCA.
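The comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and it does not use the Kaggle dataset. The synthetic four-attribute data, the farthest-point centroid initialisation, the power-iteration PCA, and every function name (`kmeans`, `sse`, `silhouette`, `pca`) are assumptions made for illustration. The values it prints will therefore not match the 0.5674 and 0.5491 reported above.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    # Farthest-point initialisation: an assumption chosen for determinism;
    # the abstract does not say how centroids were initialised.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

def sse(points, labels, centers):
    # Sum of squared distances from each point to its assigned centroid.
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))

def silhouette(points, labels):
    # Mean of (b - a) / max(a, b) over all points, where a is the mean distance
    # to the point's own cluster and b the mean distance to the nearest other one.
    k = max(labels) + 1
    scores = []
    for i, p in enumerate(points):
        sums, counts = [0.0] * k, [0] * k
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] += math.dist(p, q)
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sums[own] / counts[own]
        b = min(sums[c] / counts[c] for c in range(k) if c != own and counts[c] > 0)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def pca(points, n_components, iters=200, seed=0):
    # Leading eigenvectors of the covariance matrix via power iteration
    # with deflation, then projection of the centred data onto them.
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    mean = [sum(col) / n for col in zip(*points)]
    centered = [[x - m for x, m in zip(p, mean)] for p in points]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)] for i in range(d)]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [tuple(sum(x * c for x, c in zip(r, comp)) for comp in components) for r in centered]

# Synthetic stand-in data (hypothetical): three Gaussian groups, four attributes.
rng = random.Random(0)
true_centers = [(0, 0, 0, 0), (5, 5, 0, 0), (10, 0, 5, 0)]
data = [tuple(rng.gauss(m, 0.5) for m in c) for c in true_centers for _ in range(30)]
reduced = pca(data, n_components=2)

results = {}
for name, pts in (("without PCA", data), ("with PCA", reduced)):
    for k in range(2, 6):
        labels, centers = kmeans(pts, k)
        results[name, k] = (sse(pts, labels, centers), silhouette(pts, labels))
        print(f"{name:12s} k={k}  SSE={results[name, k][0]:9.2f}  "
              f"silhouette={results[name, k][1]:.4f}")
```

Reading the printed table mirrors the two evaluations in the abstract: the elbow is the value of k after which SSE stops dropping sharply, and the preferred k is the one with the highest silhouette coefficient. On real data a library implementation such as scikit-learn would be the more practical choice; the pure-Python version is used here only to keep the sketch self-contained.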
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.