Sentiment Analysis on Social Media X (Twitter) Against ChatGPT Using the K-Nearest Neighbors Algorithm
E-ISSN : 2807-9035
Volume 4, Number 1, May 2024
https://doi.org/10.47709/brilliance.v4i1.4105
Asep Arwan Sulaeman1*, Muhtajuddin Danny2, Sufajar Butsianto3, Suria Pratama4
1,2,3,4 Informatics Engineering Study Program, Faculty of Engineering, Pelita Bangsa University, Indonesia
*Corresponding Author
Article History:
Submitted: 15-06-2024
Accepted: 16-06-2024
Published: 01-07-2024
Keywords: ChatGPT; K-Nearest Neighbors (KNN); sentiment analysis; Twitter.
Brilliance: Research of Artificial Intelligence is licensed under a Creative Commons
Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
ABSTRACT
This research analyzes the public's response to ChatGPT using data obtained from Twitter. It aims to
determine whether people's responses tend to be positive or negative, and to test the performance of
the K-Nearest Neighbors (KNN) method in classifying sentiment patterns in tweet data. Public responses
were divided into positive and negative categories, and the KNN method was then tested with varying
k values. The testing covered dataset division, vectorization of the text data using TF-IDF,
initialization and training of the KNN model, and evaluation of model performance using metrics such
as precision, recall, and F1-score. The sentiment analysis shows that the majority of responses to
ChatGPT are positive (74.3%), while 25.7% are negative. The KNN model reached its highest accuracy,
88%, at k = 5, and its precision, recall, and F1-score were also satisfactory. The study concludes
that sentiment analysis and classification using KNN are effective for understanding people's
responses to ChatGPT.
INTRODUCTION
Amid increasingly rapid advances in information and communication technology (Triyono & Febriani,
2018), ChatGPT has become an innovation that has attracted public attention. ChatGPT is a natural
language processing (NLP) technology that receives commands and questions from users as text, called
prompts, entered into the application, and answers them in text form.
Sentiment analysis is a method of extracting information from text to determine whether an opinion
or assessment expressed by internet users on social media is positive, neutral, or negative. This technique is used to evaluate
personal opinions expressed by users, providing insight into their views on something (Fransiska Vina Sari & Arief
Wibowo, 2019). Sentiment analysis can provide valuable insight into how users evaluate ChatGPT's performance and
help identify potential improvements (Lubis et al., 2024). By continuously monitoring user sentiment, the development
team can respond to feedback more effectively, ensuring that ChatGPT continues to evolve according to user
expectations and needs (Irwansyah Suwahyu et al., 2024).
This research uses the K-Nearest Neighbors (KNN) classification method, which was chosen because it
classifies data based on their level of similarity to the nearest neighbors. KNN is a relatively
simple classification approach that does not require complicated calculations (Endang Sholihatin et
al., 2023). It is hoped that this method can identify the sentiment patterns that appear in
tweets on Twitter (Novianti & Wibowo, 2022).
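The nearest-neighbor idea can be illustrated with a minimal sketch. It assumes TF-IDF weighting (the vectorization named in the abstract) and cosine similarity as the similarity measure; the paper does not state which distance it used, so cosine is an assumption here. The toy tweets, the labels, and the helper names (tokenize, fit_idf, tfidf, cosine, knn_predict, classify) are hypothetical and are not the authors' data or implementation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(term for term in doc if term in idf)
    vec = {term: count * idf[term] for term, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def knn_predict(query, train_vecs, train_labels, k):
    """Return the majority label among the k training vectors most similar to query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, not taken from the paper's dataset.
TRAIN_TEXTS = [
    "chatgpt is amazing and helpful",
    "love how helpful chatgpt is",
    "chatgpt gives great answers",
    "chatgpt is wrong and useless",
    "terrible answers from chatgpt",
    "chatgpt is useless and annoying",
]
TRAIN_LABELS = ["positive"] * 3 + ["negative"] * 3

train_docs = [tokenize(text) for text in TRAIN_TEXTS]
IDF = fit_idf(train_docs)
TRAIN_VECS = [tfidf(doc, IDF) for doc in train_docs]


def classify(text, k=3):
    """Vectorize a new tweet with the training IDF and classify it with KNN."""
    return knn_predict(tfidf(tokenize(text), IDF), TRAIN_VECS, TRAIN_LABELS, k)


# Try several odd k values, mirroring the idea of testing varying k.
for k in (1, 3, 5):
    print(k, classify("chatgpt is so helpful and great", k))
```

Odd values of k avoid tied votes in a two-class problem, which is one common reason to test values such as 3, 5, and 7.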
ChatGPT, a smart language processing technology that is now widely discussed, has given rise to
various views from users, both positive and negative. This study therefore investigates what people
think about ChatGPT and whether their views tend to be positive or negative. The authors use a simple
approach, the K-Nearest Neighbors (KNN) method, to better understand how people perceive the
advantages and disadvantages of ChatGPT. The research thus not only helps in understanding people's
overall opinion of ChatGPT, but also contributes to further understanding of how language processing
technology is received by society (Irsalinda et al., 2021).
LITERATURE REVIEW
Faiza Rizqi Irawan, Ahmad Jazuli, and Tutik Khotimah (Rizqi Irawan, 2022), in Sentiment Analysis Of
Gojek Users Using The K-Nearest Neighbors Method, explained that applying the K-Nearest Neighbor
method to classify Twitter user responses can give the company a basis for evaluating and assessing
Gojek services. Testing this method using a confusion matrix on a dataset of 1409 records shows an
accuracy of 79.43% with k = 15.
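As a brief illustration of how accuracy and related metrics are read off a confusion matrix, the following sketch computes them for a two-class problem. The function name binary_metrics and the counts passed to it are hypothetical; they are not the figures from the Gojek study, which reports only the final accuracy.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only for illustration.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy counts every correct prediction, whereas precision and recall focus on the positive class, so the three can differ noticeably when the classes are imbalanced.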
Muhamad Trian Diwandanu and Lu'lu Mawaddah Wisudawati (Diwandanu & Wisudawati, 2023), in Sentiment
Analysis of Twit Maxim On Twitter Using R Programming And K Nearest Neighbors, describe a research
process that begins with collecting data (crawling), followed by data preprocessing, labeling,
classification using the KNN algorithm, and finally evaluation. The evaluation results show an
accuracy of 94.33% in classifying the data.
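The preprocessing step in such a pipeline can be sketched as follows. This is only an assumed set of typical tweet-cleaning operations (lowercasing, removing URLs and mentions, stripping punctuation and digits), written in Python rather than the R used in the cited study; the function name preprocess and the sample tweet are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # web links
MENTION_RE = re.compile(r"@\w+")                # @username mentions
NON_ALPHA_RE = re.compile(r"[^a-z\s]")          # anything except lowercase letters and spaces


def preprocess(tweet):
    """Clean one tweet and return its list of lowercase word tokens."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")        # keep the hashtag word, drop the symbol
    text = NON_ALPHA_RE.sub(" ", text)   # remove punctuation, digits, emoji
    return text.split()


# Hypothetical sample tweet.
print(preprocess("@user ChatGPT is GREAT!! https://t.co/abc #AI"))
```

Steps such as stop-word removal or stemming are often added after this cleaning stage, depending on the language of the tweets.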
The Flouting Maxim on Twitter Influencers’ Tweets is a qualitative study that aims to determine how
maxim principles are used in tweets by certain social media influencers in Indonesia. The research is
limited to whether users comply with the cooperative principle and its maxims, especially the maxim
of relevance, and for what purpose users usually violate these maxims (Hassani, 2019). The results
vary: most of the conversations do not satisfy the maxim of relevance. The purpose of flouting it is
typically to crack jokes and to keep the conversation going smoothly while maintaining good
manners (Syaifuddin et al., 2021).
Analysis Of Sentiment Towards The Community For New Banknote Using The K-Nearest Neighbor (KNN)
Algorithm is a study by Septi Hasanah, Intan Purwasih, and Imam Santoso (Septi Hasanah et al., 2023). The study
applied K-Nearest Neighbor to 510 data points for sentiment analysis, achieving an accuracy of 75.06%. The process involves
data crawling, pre-processing, and classification with RapidMiner. Evaluation was carried out by taking True Positives
and True Negatives, showing positive sentiment of 67.74% and negative 76.65%. In conclusion, the (...truncated)