Prediction of The Level of Public Trust in Government Policies in the 1st Quarter of The Covid 19 Pandemic using Sentiment Analysis

The Covid-19 pandemic has brought changes to society, including changes in Government policy. These policy changes have drawn mixed responses from the public, particularly netizens. Netizens share their opinions on social media, including Twitter, and these opinions can represent the public's trust in the Government. Sentiment analysis examines people's opinions and categorizes them as positive, negative, or neutral. Because it can process large numbers of opinions, public opinion can be analyzed quickly. This paper explains how to analyze public trust using sentiment analysis, with the Naïve Bayes classification method used to classify sentiment. The research data were taken from Twitter in the first quarter of the Covid-19 pandemic and comprise around 3000 tweets. The tweets relate to Covid-19 and the Government and come from several countries, including the United States, Australia, Ireland, Switzerland, Italy, the Philippines, Sri Lanka, Canada, the Netherlands, the United Kingdom, Germany, and Lebanon. This study aims to determine the level of public trust in the Government in the first quarter of the Covid-19 pandemic. The results are expected to serve as a reference for public policy stakeholders in determining future policies.


Introduction
Covid-19 has become one of the most widely discussed topics worldwide since the World Health Organization (WHO) declared it a global pandemic. Covid-19 has brought several changes to social life, including changes in the policies set by governments. Many countries have imposed lockdowns or prohibited their citizens from leaving their homes to prevent the spread of the coronavirus. These policies have caused various reactions from the public, and many people respond to government policies through social media such as Twitter.
The use of social media is currently proliferating. When the research was conducted, the number of social media users had reached 3.8 million [1]. Many people use social media to share their opinions and thoughts, and Twitter is one of the platforms most widely used for this purpose today. Through Twitter, many people disseminate various kinds of information. Based on these data, it can be concluded that many Twitter users voice their opinions on current issues, including the Covid-19 pandemic. Opinions in the form of tweets can represent public confidence in the government.
Public trust is an essential variable in building good governance. Public trust generates legitimacy that can create social assets for the government, which can be used to gain political support and to support government activities [2]. Analyzing public trust in government policies during a pandemic is therefore important, because public trust is the most crucial element of the general administration's legitimacy [3]. Sentiment analysis is well suited to this task. It can automatically categorize opinions using pre-existing data and training data, sorting responses from the public into positive, negative, and neutral sentiments [2]. Because it can classify large numbers of opinions, it reduces the time and effort required.
Naïve Bayes is used because it has shown higher accuracy than other algorithms such as Decision Tree and Random Forest. This is supported by the research of Fitri, Andreswari, and Hasibuan on the LGBT campaign in Indonesia using the Naïve Bayes, Decision Tree, and Random Forest algorithms [4]. They reported an accuracy of 86.43% with Naïve Bayes and 82.91% with Decision Tree and Random Forest. Another reason for using Naïve Bayes in this study is its good performance in sentiment analysis. Mubarok et al. [5] found Naïve Bayes suitable for their research, with a best F1-measure of 78.12%.
The purpose of this study is to analyze public trust in the government during the Covid-19 pandemic. Specifically, it aims to determine the degree of public trust in Government policies during the pandemic through sentiment analysis of Twitter data.

Research Method
Sentiment analysis can be done by several methods, one of which is Naïve Bayes, a classification method based on Bayesian probability. The method used here consists of several stages: data collection, data pre-processing, and application of the Naïve Bayes classifier.

Data Collection
This study takes data from Kaggle.com [6] and reanalyzes it using a different method. The original data was analyzed using the Natural Language Processing (NLP) method, whereas this study uses the Naïve Bayes classification method. The original data amounted to 3000 tweets, of which 100 tweets were used; this final number was obtained with a simple random sampling technique. The sample size is given by the following formula:
$$ n = \frac{N}{1 + N e^{2}} \tag{1} $$
where:
n = sample size
N = population size
e = the desired margin of error
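The sample-size relation n = N / (1 + N e^2), commonly known as Slovin's formula, and the simple random sampling step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the fixed seed, and the stand-in list of tweet identifiers are hypothetical and are not taken from the study.

```python
import random


def slovin_sample_size(population_size: int, margin_of_error: float) -> float:
    """Return n = N / (1 + N * e^2) for population N and margin of error e."""
    return population_size / (1 + population_size * margin_of_error ** 2)


def simple_random_sample(items, sample_size, seed=0):
    """Draw sample_size items without replacement.

    The seed is an assumption, added only to make the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(items, sample_size)


# N = 3000 tweets and e = 10% give roughly 96.77, which the study rounds to 100.
n = slovin_sample_size(3000, 0.10)

# Stand-in for the 3000 tweets (hypothetical identifiers, not the real data).
tweet_ids = list(range(3000))
sample = simple_random_sample(tweet_ids, 100)
```

Drawing without replacement ensures that no tweet appears twice in the sample.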

Pre-processing Data
This process is necessary because the data from Twitter still contains content other than sentiment, such as emoticons, website links, hashtags, and white space. Pre-processing removes website links, special symbols, usernames (which start with @), and hashtags, and transforms emoticons into their word equivalents. Pre-processing is carried out so that the sentiment analysis is more accurate [7].
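A minimal sketch of such a cleaning step is shown below, using Python's standard `re` module. The emoticon lookup table, the function name, and the exact regular expressions are assumptions made for illustration; the study does not specify its own mapping or patterns.

```python
import re

# Hypothetical emoticon lookup; the study does not list its actual mapping.
EMOTICON_WORDS = {":)": "smile", ":(": "sad", ":D": "laugh"}


def preprocess_tweet(text: str) -> str:
    """Clean one tweet: emoticons to words, then strip non-sentiment content."""
    # Replace emoticons first, before symbol removal would strip them.
    for emoticon, word in EMOTICON_WORDS.items():
        text = text.replace(emoticon, f" {word} ")
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # website links
    text = re.sub(r"[@#]\w+", " ", text)                 # usernames and hashtags
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)          # special symbols
    return re.sub(r"\s+", " ", text).strip()             # extra white space


cleaned = preprocess_tweet("@user Stay home :) #COVID19 https://example.com/x   now!")
```

The order of the steps matters: links and hashtags are removed before the general symbol filter so that their word parts do not survive as stray tokens.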

Naïve Bayes Classification
The primary task in sentiment analysis of tweets with the Naïve Bayes classifier is to classify tweets as positive, negative, or neutral. Naïve Bayes is a statistical classification algorithm that processes data using Bayesian probability. It classifies texts based on keyword probabilities, comparing training documents (past knowledge) with test documents (observed data) [7]. This study uses Naïve Bayes classification because it fits the purpose of the study. Naïve Bayes classification assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature [8]. The formula is:
$$ P(c \mid Z) = \frac{P(Z \mid c)\, P(c)}{P(Z)} \tag{2} $$
where:
P(c|Z) = posterior probability of class c given predictor Z
P(Z|c) = likelihood, the probability of predictor Z given class c
P(c) = class prior probability
P(Z) = predictor prior probability
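For a tweet Z made up of words w_1, ..., w_n, the independence assumption lets the likelihood factor into per-word terms. Because P(Z) is the same for every class, it can be dropped when classes are compared. The resulting decision rule can be sketched as follows, where w_i and \hat{c} are symbols introduced here for illustration and do not appear in the original formula:

```latex
P(c \mid Z) \propto P(c) \prod_{i=1}^{n} P(w_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)
```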

Evaluation
The last stage, after the data has been classified using the Naïve Bayes algorithm, is evaluation. This stage aims to determine the accuracy of the classification that has been carried out. The result of the evaluation can be used to draw conclusions. In the classification process, there is a False Statement [4]. This means that errors can occur when sentences are grouped. For example, a sentence is classified into a positive sentence even

Data Collection
The original data, or the population size, is N = 3000 and e = 10%, so the calculation becomes:
$$ n = \frac{3000}{1 + 3000(0.1)^{2}} = 96.774 \approx 100 $$
Due to fear of #COVID2019 people have started hoarding items of daily use. This will impact the regular level of supply in markets, if these commodities will disappear from markets it will create shortage and prices will go up. Then people will start cursing government.

Pre-processing Data
This process removes elements other than sentiment, such as website links, special symbols, usernames (which start with @), hashtags, and white space, and transforms emoticons into their word equivalents. Here is an example:
Tweet before pre-processing data: #Coronavirus is "an exposure of all the holes in the social safety net" says NELP Government Affairs Director Judy Conti
Tweet after pre-processing data: is an exposure of all the holes in the social safety net, says NELP Government Affairs Director Judy Conti

Naïve Bayes Classification
This process classifies the tweets with the Bayesian classifier. Before the tweets are classified, the training data is grouped into positive, negative, and neutral tweets.
After the training data is grouped, the next step is to calculate the probability of each word in the training data for the positive, negative, and neutral categories. The test data can then be scored with formula number 2. The previous step provides the probability of each word in each category (positive, negative, or neutral). To classify each sentence in the test data, the probabilities of its words from the previous step are used. The results for the three categories are compared to determine the sentiment of the tweet; the category with the highest result is assigned to that sentence. Of the 100 tweets, 18 were declared true positive, 26 true negative, and 3 true neutral. Based on this result, the accuracy of this research is 44.44%.
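The steps above can be sketched in Python as follows. This is a minimal illustration, not the study's implementation: the function names and the toy training tweets are hypothetical, and two choices are assumptions that the text does not state, namely add-one (Laplace) smoothing for words unseen in a class and the use of log probabilities to avoid numeric underflow.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_tweets):
    """Count tweets per class and word occurrences per class.

    labelled_tweets: list of (text, label) pairs, already pre-processed.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labelled_tweets:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, class_counts, word_counts, vocabulary):
    """Return the class with the highest log P(c) + sum of log P(word | c)."""
    total_tweets = sum(class_counts.values())
    scores = {}
    for label, tweet_count in class_counts.items():
        score = math.log(tweet_count / total_tweets)  # class prior P(c)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are skipped
            # Add-one smoothing (an assumption) keeps unseen words from
            # forcing the whole class probability to zero.
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
training = [
    ("government response is good", "positive"),
    ("great support from government", "positive"),
    ("government failed badly", "negative"),
    ("bad policy and poor response", "negative"),
    ("government announced new policy today", "neutral"),
]
model = train_naive_bayes(training)
prediction = classify("good support", *model)
```

Comparing scores across the three classes and taking the maximum mirrors the rule that the highest result decides the category of a sentence.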