Sentiment Analysis of Gojek Driver Application Reviews Using Support Vector Machine and Naïve Bayes with Optuna-Based Hyperparameter Tuning
Journal of Deep Learning, Computer Vision and Digital Image Processing
E-ISSN: 2986-8939 P-ISSN: 2986-8920
Vol. 4, No. 1, March 2026
DOI: https://doi.org/10.61255/decoding.v4i1.948
Nadilla Madjid
Universitas Trilogi, Jakarta,
Indonesia
Rudi Setiawan
Universitas Trilogi, Jakarta,
Indonesia
ARTICLE HISTORY
Received: 6 February 2026
Revised: 3 March 2026
Accepted: 27 March 2026
KEYWORDS
Hyperparameter Tuning; Naïve Bayes; Optuna; Sentiment Analysis; Support Vector Machine
ABSTRACT
Purpose – This study aims to analyze user review sentiment toward the Gojek
Driver application and compare the performance of two classification algorithms,
Support Vector Machine (SVM) and Naïve Bayes, using Optuna as a framework
for hyperparameter tuning.
Methods – The study collected and labeled user review data into positive and
negative sentiment categories. Text preprocessing involved cleaning, case
folding, normalization, tokenization, stopword removal, and stemming. Features
were represented using TF-IDF. The dataset was then divided into training and
testing sets, and SVM and Naïve Bayes models were trained using automated
hyperparameter optimization with Optuna. Model performance was evaluated
using accuracy, precision, recall, F1-score, and a confusion matrix.
Findings – The Optuna-tuned SVM model trained with SMOTE outperformed the
other models tested in this study. This best model achieved an accuracy of
86.8%, a best cross-validation accuracy of 92.72%, and a weighted-average
F1-score of 0.87. These results indicate that SVM was more effective than
Naïve Bayes in handling high-dimensional TF-IDF features and complex decision
boundaries.
Research implications – The findings support the use of automated sentiment
analysis to assist operational decision-making and improve the quality of Gojek
Driver services. The proposed approach can accelerate the identification of
service-related issues and provide a basis for proactive responses to user
feedback.
Originality – This study offers an original contribution by directly comparing
SVM and Naïve Bayes on a Gojek Driver review dataset while applying
Optuna-based hyperparameter tuning. It highlights the effect of automated tuning on
both algorithms within a TF-IDF representation framework for ride-hailing
service data, a topic that remains underexplored in the specific context of Gojek
Driver within the local literature.
Correspondence Author:
To cite this article: N. Madjid & R. Setiawan. (2026). Sentiment Analysis of Gojek Driver Application Reviews
Using Support Vector Machine and Naïve Bayes with Optuna-Based Hyperparameter Tuning. Journal of Deep
Learning, Computer Vision and Digital Image Processing, 4(1), 1-14.
https://doi.org/10.61255/decoding.v4i1.948
This is an open access article under the CC BY-SA license
INTRODUCTION
Digital service applications such as Gojek Driver have become an important part of everyday life in
Indonesia, particularly in supporting mobility, platform-based work, and service interaction between
drivers and users. As a large-scale digital platform, Gojek Driver receives substantial user feedback
through platforms such as the Google Play Store. These reviews provide valuable information about
service quality, user experience, technical problems, and perceived platform performance [1], [2], [3],
[4]. However, manually evaluating large volumes of user reviews is inefficient and difficult to sustain,
especially when the data continue to grow over time [5], [6]. Automated sentiment analysis is
therefore necessary to classify user opinions into positive and negative sentiment categories and to
support service quality improvement in digital platform ecosystems.
Previous studies have applied various machine learning algorithms for sentiment analysis in digital
application contexts. Naïve Bayes has been widely used in sentiment classification because of its
simplicity and efficiency [7], [8], [9]. Decision Tree algorithms have also been applied in several
sentiment analysis studies involving user feedback and application reviews [10], [11], [12]. Other
studies have employed K-Nearest Neighbor to classify sentiment patterns in user-generated textual
data [13], [14], [15], while Support Vector Machine has frequently been used because of its strong
performance in text classification tasks [16], [17], [18]. A comparative study on sentiment
classification for the Satu Sehat application reported that SVM achieved higher accuracy, at 87.95%,
than Naïve Bayes, which reached 81.65% [19]. This finding suggests that SVM can perform better in
sentiment classification, particularly when dealing with imbalanced data, which remains one of the
main challenges in sentiment analysis.
Although prior studies have contributed to the development of sentiment analysis for digital
applications, several methodological limitations remain. Many existing studies have not fully
integrated lexicon-based automatic labeling, imbalance handling through the Synthetic Minority
Over-sampling Technique (SMOTE), and automated hyperparameter optimization using frameworks such as
Optuna. These components are important because sentiment datasets from application reviews often
contain unequal class distributions, noisy informal language, and high-dimensional textual features.
Without proper imbalance handling and parameter optimization, classification models may produce
biased performance, particularly toward the majority class. This issue is especially relevant in the
context of user reviews, where negative and positive sentiment may not be evenly distributed.
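The imbalance-handling idea referred to above can be illustrated with a minimal sketch of the interpolation step at the core of SMOTE: a synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. The function name, the toy two-dimensional vectors, and the parameter values below are hypothetical and serve only as an illustration; this is not the implementation used in this study. In practice, a library implementation such as the one in imbalanced-learn would normally be used.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.

    Simplified illustration of the SMOTE idea; needs at least two
    minority samples.
    """
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + lam * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D feature vectors standing in for TF-IDF rows.
majority = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.1), (0.9, 0.3), (0.8, 0.0), (0.6, 0.2)]
minority = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7)]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies between two existing minority samples, the minority class grows without exact duplication. Oversampling of this kind is commonly applied to the training portion only, so that the test data keep their original class distribution.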
In this context, the Gojek Driver application provides a relevant case for sentiment analysis. The
application has a large and diverse user base, and its review data reflect various user perceptions of
platform reliability, service quality, technical functionality, and operational experience. This study
analyzes user sentiment toward the Gojek Driver application by comparing two widely used
classification algorithms, namely Naïve Bayes and Support Vector Machine. Naïve Bayes was selected
because of its ability to perform classification efficiently even with limited data [20], [21]. SVM was
selected because of its capability to construct strong decision margins and its suitability for
imbalanced text classification problems [22], [23], [24]. Previous studies also indicate that SVM can
achieve strong performance when combined with SMOTE for imbalance handling [25], [26].
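The efficiency attributed to Naïve Bayes above follows from the fact that training reduces to counting words per class. The following minimal sketch of a multinomial Naïve Bayes classifier with Laplace smoothing illustrates this under stated assumptions: the tokenized reviews, the labels, and the function names are hypothetical examples and are not taken from the dataset or code of this study.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {}
    for label, n_class_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        denom = total_words + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_class_docs / len(docs)),
            "log_lik": {
                w: math.log((word_counts[label][w] + alpha) / denom)
                for w in vocab
            },
        }
    return model


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior score."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for w in tokens:
            if w in params["log_lik"]:  # words unseen in training are ignored
                score += params["log_lik"][w]
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical, already preprocessed reviews (tokenized, lower-cased).
docs = [
    ["app", "good", "helpful"],
    ["orders", "smooth", "good"],
    ["app", "error", "slow"],
    ["orders", "few", "error"],
]
labels = ["positive", "positive", "negative", "negative"]

model = train_nb(docs, labels)
print(predict_nb(model, ["app", "good"]))
print(predict_nb(model, ["error", "slow"]))
```

Training requires a single pass over the documents and no iterative optimization, which is why the method remains practical even with limited data. An SVM, by contrast, solves an optimization problem to find a maximum-margin boundary.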
This study further integrates Optuna-based hyperparameter tuning to automatically optimize model
parameters and improve classification performance. By combining SMOTE and stratified train-test
splitting, this study examines whethe (...truncated)