Issue | E3S Web Conf., Volume 448, 2023. The 8th International Conference on Energy, Environment, Epidemiology and Information System (ICENIS 2023) |
---|---|
Article Number | 02044 |
Number of page(s) | 10 |
Section | Information System |
DOI | https://doi.org/10.1051/e3sconf/202344802044 |
Published online | 17 November 2023 |
Machine learning approach to customer sentiment analysis in twitter airline reviews
1 Doctoral of Information System Department, Diponegoro University, Semarang, Indonesia
2 Physics Department, Science and Math Faculty, Diponegoro University, Semarang, Indonesia
3 Informatics Department, Science and Math Faculty, Diponegoro University, Semarang, Indonesia
* Corresponding author: eka.pujo@hangtuah.ac.id
Customers typically rate and review the services they use, both online and physical. The volume of such reviews, however, can grow very quickly, and machine learning is well suited to recognizing patterns in this kind of data. Numerous algorithms have been developed that can be employed for sentiment analysis. To categorize tweets about airlines as positive, neutral, or negative, this study compares the effectiveness of eight machine learning algorithms: Naive Bayes (NB), Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Adaboost, Extreme Gradient Boosting (XGB), Light Gradient Boosting Machine (LGBM), and Random Forest (RF). Each algorithm classifies the Twitter airline sentiment data using the TF-IDF model. The experiment involved two phases: classification with SMOTE and classification without SMOTE, using Stratified K-Fold cross-validation (CV). With SMOTE, the RF model achieves the highest accuracy, 97.56%. Without SMOTE, RF again achieves the highest accuracy, 92.21%. The findings demonstrate that SMOTE oversampling can improve sentiment analysis accuracy.
Key words: airline reviews / sentiment analysis / machine learning / SMOTE / stratified k-fold CV
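To make the interplay between SMOTE and Stratified K-Fold CV described in the abstract concrete, the following is a minimal standard-library sketch, not the authors' implementation. It rests on several assumptions: the function names `stratified_folds` and `smote` are hypothetical, the toy two-dimensional vectors stand in for TF-IDF features, the SMOTE variant is simplified, and oversampling is applied only to the training portion of each fold (the abstract does not state where in the pipeline SMOTE was applied). No classifier is trained here.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class's samples round-robin across the folds.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds


def smote(X, y, k_neighbors=2, seed=0):
    """Simplified SMOTE: grow each minority class to the majority size by
    interpolating between a sample and one of its nearest same-class
    neighbours."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for cls, n in counts.items():
        pts = [x for x, label in zip(X, y) if label == cls]
        if len(pts) < 2:
            continue  # cannot interpolate from a single point
        for _ in range(target - n):
            base = rng.choice(pts)
            # Nearest same-class neighbours by squared Euclidean distance.
            neighbours = sorted(
                (p for p in pts if p is not base),
                key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
            )[:k_neighbors]
            other = rng.choice(neighbours)
            gap = rng.random()
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(cls)
    return X_out, y_out


# Toy imbalanced data set (illustrative only, not the paper's data).
y = ["negative"] * 12 + ["neutral"] * 6 + ["positive"] * 3
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in y]

folds = stratified_folds(y, k=3)
balanced_counts = []
for fold_no, test_idx in enumerate(folds):
    held_out = set(test_idx)
    X_train = [X[i] for i in range(len(X)) if i not in held_out]
    y_train = [y[i] for i in range(len(y)) if i not in held_out]
    # Oversample the training portion only; the test fold stays untouched.
    X_bal, y_bal = smote(X_train, y_train, seed=fold_no)
    balanced_counts.append(Counter(y_bal))
    print(fold_no, Counter(y_train), "->", Counter(y_bal))
```

Keeping the held-out fold free of synthetic samples is a design choice of this sketch: it means each fold is scored on real data only. In practice, library implementations such as `StratifiedKFold` in scikit-learn and `SMOTE` in imbalanced-learn would normally replace these hand-written helpers.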
© The Authors, published by EDP Sciences, 2023
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.