Sentiment Analysis Of Indosat's Mobile Operator Services On Twitter Using The Naïve Bayes Algorithm
E-ISSN : 2807-9035
Volume 4, Number 1, May 2024
https://doi.org/10.47709/brilliance.v4i1.4084
Sufajar Butsianto1*, Sifa Fauziah2, Candra Naya3, Futuh Maulana4
1,2,3,4
Informatics Engineering Study Program, Faculty of Engineering, Pelita Bangsa University, Indonesia
*Corresponding Author
Article History:
Submitted: 12-06-2024
Accepted: 13-06-2024
Published: 28-06-2024
Keywords:
Twitter, Data Mining, Naïve Bayes, RStudio
Brilliance: Research of
Artificial Intelligence is licensed
under a Creative Commons
Attribution-NonCommercial 4.0
International (CC BY-NC 4.0).
ABSTRACT
Twitter is a social media platform that allows users to share information with
others in real time. Information shared on Twitter is usually referred to as a
tweet. Sentiment analysis is a branch of text mining research in which sentiment
is identified and extracted from text and is usually categorized by polarity:
positive, negative or neutral. Opinions posted on Twitter can be processed with
a data mining technique, namely classification. The algorithm used in this
research is the Naïve Bayes algorithm, run in RStudio, an application for
working with R. R is a high-level programming language and an environment for
data analysis and graphics that allows users to program algorithms and use
tools that other users have developed in R. In the experiments, sentiment
classification with the Naïve Bayes method was evaluated using training-to-test
data ratios of 20%:80%, 40%:60%, 60%:40%, 80%:20% and 90%:10%, and 10-fold
cross-validation yielded an average accuracy of 85.00%. Performance decreased
at the 80%:20% ratio (1,440 training records and 360 test records), while the
20%:80% and 90%:10% ratios yielded the same accuracy, namely 85.41%.
INTRODUCTION
Information technology is moving in an increasingly digital direction day by day. The digital era has brought people
into a new lifestyle that cannot be separated from electronic devices. Technology is a tool that serves human needs,
and with it everything can be done more easily. The challenges of the digital era have also reached various fields such
as politics, economics, social culture, defense, security and information technology itself. The digital era was born
with the emergence of digital internet networks, especially computer information technology. The new media of the
digital era can be manipulated and are networked, or internet-based, in nature. These capabilities make it easier for
people to receive information more quickly (Megawati, 2021).
The role of technology is so important that it is bringing civilization ever more rapidly into the digital era, and the
development of communication technology has changed how people tend to express their opinions on social media. One
social media platform that is popular among internet users today, and the most popular for expressing opinions, is
Twitter (Syarifuddin, 2020). Twitter prioritizes socializing through text, although newer versions also support video
and photo formats in tweets. This makes Twitter a suitable tool for collecting public sentiment data on the internet
(Asro’i & Februariyanti, 2022).
Twitter is a social media platform that allows users to share information with other people in real time. Information
shared on Twitter is usually called a tweet (Elsa Annisa Batu Bara et al., 22 C.E.). Indonesia ranks fifth among the
countries with the most Twitter users, after England and other large countries (Kominfo.go.id, 2020). With so many
social media users, social media marketing has become a strategic key in marketing activities worldwide, as shown by
the 94% of companies in the world that use social media for marketing purposes. Cellular operator companies likewise
use Twitter as a promotional medium, one of which is Indosat, one of the largest cellular operators in Indonesia.
Indosat's official account currently has 1.5 million followers. However, Indosat is not exempt from its users'
comments, both positive and negative. The large number of comments submitted by Indosat users can be mined for
information using sentiment analysis (Syailendra Reza Irwansyah Rezeki et al., 2020).
Sentiment analysis is a branch of text mining research in which sentiment is identified and extracted from text and is
usually categorized by polarity: positive, negative or neutral. Opinion data on Twitter can be processed with a data
mining technique, namely classification. To group sentiment, the authors use three categories, namely positive,
negative and neutral sentiment, assigned on the basis of the tweets. Currently, Twitter
is a good indicator for influencing research. This sentiment analysis is carried out to determine public sentiment about
a topic using a machine learning approach (Putranti & Winarko, 2014).
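As an illustration of the classification step described above, the following is a minimal sketch of a multinomial Naïve Bayes classifier with Laplace smoothing that assigns one of the three polarity labels to a short text. It is written in Python rather than in the R environment used in this study, and it is not the authors' implementation. The function names train_naive_bayes and predict, the whitespace tokenization and the six toy tweets are hypothetical and serve only to make the example self-contained.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Estimate log-priors and Laplace-smoothed word log-likelihoods per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(documents, labels):
        tokens = text.lower().split()  # assumption: simple whitespace tokenization
        word_counts[label].update(tokens)
        vocabulary.update(tokens)

    total_docs = len(documents)
    log_prior = {c: math.log(n / total_docs) for c, n in class_counts.items()}

    log_likelihood = {}
    for c in class_counts:
        # Add-one smoothing: every vocabulary word gets one extra count per class.
        denominator = sum(word_counts[c].values()) + len(vocabulary)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / denominator) for w in vocabulary
        }
    return log_prior, log_likelihood, vocabulary


def predict(model, text):
    """Return the class with the highest posterior log-score for the text."""
    log_prior, log_likelihood, vocabulary = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for token in text.lower().split():
            if token in vocabulary:  # words never seen in training are ignored
                score += log_likelihood[c][token]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the study's dataset.
TOY_TWEETS = [
    ("signal is fast and stable today", "positive"),
    ("great internet package very cheap", "positive"),
    ("network is slow and keeps dropping", "negative"),
    ("bad service internet is slow again", "negative"),
    ("how do i check my remaining quota", "neutral"),
    ("where is the nearest service outlet", "neutral"),
]

MODEL = train_naive_bayes(
    [text for text, _ in TOY_TWEETS],
    [label for _, label in TOY_TWEETS],
)

if __name__ == "__main__":
    print(predict(MODEL, "internet is slow"))
```

Working in log-space avoids numerical underflow when many word probabilities are multiplied, and the add-one smoothing keeps a word that is absent from one class from forcing that class's probability to zero.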
LITERATURE REVIEW
"Sentiment Analysis on Twitter Regarding Post-Disaster Using the Naïve Bayes Method with the N-Gram Feature" (Rozi et
al., 2023) explains that the Naïve Bayes classifier can be used to classify tweets as positive or negative, in
particular tweets about post-disaster conditions. Accuracy testing of the algorithm, carried out with manual labeling by
15 respondents, showed quite significant differences between the unigram and bigram results. Across the four tests, the
highest accuracy was 93.33% for unigrams and 86.67% for bigrams. In their research entitled "Classification of Public
Opinion on MyRepublic ISP Services using Naive Bayes", Hafiz Irsyad, Ahmad Farisi and Muhammad Rizky Pribadi classified
public opinion on the MyRepublicId ISP service using a Twitter dataset. Applying the Naïve Bayes method produced
accuracies of 0.976 for the positive category, 0.833 for neutral and 0.82895 for negative, with an average of 0.87949
(Hafiz Irsyad et al., 2019).
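To make the difference between the unigram and bigram features mentioned above concrete, the following minimal Python sketch extracts both from a single sentence. The helper name ngrams, the whitespace tokenization and the example sentence are hypothetical and are not taken from the cited studies.

```python
def ngrams(text, n):
    """Return the list of word n-grams in the text, in order of appearance."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


EXAMPLE = "the network is very slow"  # hypothetical example sentence
UNIGRAMS = ngrams(EXAMPLE, 1)
BIGRAMS = ngrams(EXAMPLE, 2)

if __name__ == "__main__":
    print(UNIGRAMS)
    print(BIGRAMS)
```

A unigram model treats each word as a separate feature, whereas a bigram model uses pairs of adjacent words. Bigrams capture some word order, but each pair occurs less often in a small corpus, which is one possible reason the two feature sets can give different accuracies.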
Dedi Darwis (Darwis et al., 2021), in their research "Application of the Naive Bayes Algorithm for Sentiment
Analysis Review of National BMKG Data" explains that the process of extracting data from National BMKG Twitter
uses the Python 3.74 programming language with preprocessi (...truncated)