Issue | E3S Web Conf., Volume 359, 2022, The 7th International Conference on Energy, Environment, Epidemiology and Information System (ICENIS 2022) | |
---|---|---|
Article Number | 05001 | |
Number of page(s) | 12 | |
Section | Information System Management and Environment | |
DOI | https://doi.org/10.1051/e3sconf/202235905001 | |
Published online | 31 October 2022 |
Auto Labeling to Increase Aspect-Based Sentiment Analysis Using K-Nearest Neighbors Method
1 Information System, School of Postgraduate Studies, Diponegoro University, Semarang, 50275, Indonesia.
2 Department of Mathematics, Faculty of Science and Mathematics, Diponegoro University, Semarang, 50275, Indonesia.
3 Department of Informatics, Faculty of Science and Mathematics, Diponegoro University, Semarang, 50275, Indonesia.
a) ahmadjazuli@students.undip.ac.id
b) widowati@lecturer.undip.ac.id
c) retno@live.undip.ac.id
Social media platforms generate a large volume of opinions, emotions, and views about public services. Sentiment analysis is used by many kinds of organizations, including universities, businesses, and political actors. Evaluating a service requires both quantitative and qualitative data, yet researchers tend to focus on quantitative data and ignore qualitative data. Student evaluations submitted as reviews are unstructured qualitative data, so conventional methods cannot be applied to them directly. Unstructured data must be labeled before it can be analyzed, and manually labeling large amounts of data is costly in both time and money. Labels must also be highly accurate, because they are used to train and test the classifier. This study aims to automate data labeling using the K-Nearest Neighbors algorithm, which can in turn improve the accuracy of sentiment analysis. The resulting classifier can categorize responses from Twitter users, and universities can use its output as material for evaluating and assessing higher education services. Evaluated with a confusion matrix on 1,409 data points, the method achieved an accuracy of 79.43% with k = 15.
Key words: Auto Labeling / Sentiment Analysis / Learning Process / K-Nearest Neighbors
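The auto-labeling idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the text features, the distance measure, or the preprocessing used, so this sketch assumes bag-of-words term counts and cosine similarity. The toy reviews, the function names, and the choice of k = 3 (rather than the reported k = 15, which would exceed this tiny seed set) are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_label(text, labeled, k=3):
    """Return the majority label among the k most similar labeled examples."""
    query = vectorize(text)
    ranked = sorted(
        labeled,
        key=lambda pair: cosine(query, vectorize(pair[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(matrix):
    """Share of examples on the diagonal of the confusion matrix."""
    total = sum(matrix.values())
    correct = sum(n for (t, p), n in matrix.items() if t == p)
    return correct / total if total else 0.0


# Hypothetical seed set of manually labeled reviews (not from the paper).
seed = [
    ("the lecturers are helpful and friendly", "positive"),
    ("great campus and helpful staff", "positive"),
    ("friendly staff and great library", "positive"),
    ("registration is slow and confusing", "negative"),
    ("slow wifi and confusing schedule", "negative"),
    ("the website is slow and broken", "negative"),
]

# Unlabeled reviews receive the majority label of their nearest neighbors.
unlabeled = [
    "helpful staff and friendly lecturers",
    "confusing and slow registration website",
]
auto_labels = [knn_label(text, seed, k=3) for text in unlabeled]

# Hypothetical evaluation: compare predictions against known labels.
true_labels = ["positive", "negative", "negative", "positive"]
predicted = ["positive", "negative", "positive", "positive"]
matrix = confusion_matrix(true_labels, predicted)
print(auto_labels, accuracy(matrix))
```

In practice, k is usually chosen by comparing accuracy across several values on held-out labeled data, and an odd k avoids tied votes when there are two classes.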
© The Authors, published by EDP Sciences, 2022
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.