Sentiment Analysis of Gojek Driver Application Reviews Using Support Vector Machine and Naïve Bayes with Optuna-Based Hyperparameter Tuning

Journal of Deep Learning, Computer Vision and Digital Image Processing
E-ISSN: 2986-8939 P-ISSN: 2986-8920
Vol. 4, No. 1, March 2026
DOI: https://doi.org/10.61255/decoding.v4i1.948

Sentiment Analysis of Gojek Driver Application Reviews Using Support Vector Machine and Naïve Bayes with Optuna-Based Hyperparameter Tuning

Nadilla Madjid
Universitas Trilogi, Jakarta, Indonesia
Rudi Setiawan
Universitas Trilogi, Jakarta, Indonesia

ARTICLE HISTORY
Received: 6 February, 2026
Revised: 3 March, 2026
Accepted: 27 March, 2026

KEYWORDS
Hyperparameter Tuning; Naïve Bayes; Optuna; Sentiment Analysis; Support Vector Machine

ABSTRACT

Purpose – This study aims to analyze user review sentiment toward the Gojek Driver application and compare the performance of two classification algorithms, Support Vector Machine (SVM) and Naïve Bayes, using Optuna as a framework for hyperparameter tuning.

Methods – The study collected and labeled user review data into positive and negative sentiment categories. Text preprocessing involved cleaning, case folding, normalization, tokenization, stopword removal, and stemming. Features were represented using TF-IDF. The dataset was then divided into training and testing sets, and SVM and Naïve Bayes models were trained using automated hyperparameter optimization with Optuna. Model performance was evaluated using accuracy, precision, recall, F1-score, and confusion matrix.

Findings – The application of SMOTE to the Optuna-tuned SVM model produced better performance than the other models tested in this study. The best model achieved an accuracy of 0.868, a highest cross-validation accuracy of 92.72%, and a weighted average F1-score of 0.87. These results indicate that SVM was more effective in handling high-dimensional TF-IDF features and complex decision boundaries.

Research implications – The findings support the use of automated sentiment analysis to assist operational decision-making and improve the quality of Gojek Driver services.
The proposed approach can accelerate the identification of service-related issues and provide a basis for proactive responses to user feedback.

Originality – This study offers an original contribution by directly comparing SVM and Naïve Bayes on a Gojek Driver review dataset while applying Optuna-based hyperparameter tuning. It highlights the effect of automated tuning on both algorithms within a TF-IDF representation framework for ride-hailing service data, a topic that remains underexplored in the specific context of Gojek Driver within the local literature.

To cite this article: N. Madjid & R. Setiawan. (2026). Sentiment Analysis of Gojek Driver Application Reviews Using Support Vector Machine and Naïve Bayes with Optuna-Based Hyperparameter Tuning. Journal of Deep Learning, Computer Vision and Digital Image Processing, 4(1), 1-14. https://doi.org/10.61255/decoding.v4i1.948

This is an open access article under the CC BY-SA license.

INTRODUCTION

Digital service applications such as Gojek Driver have become an important part of everyday life in Indonesia, particularly in supporting mobility, platform-based work, and service interaction between drivers and users. As a large-scale digital platform, Gojek Driver receives substantial user feedback through platforms such as the Google Play Store. These reviews provide valuable information about service quality, user experience, technical problems, and perceived platform performance [1], [2], [3], [4]. However, manually evaluating large volumes of user reviews is inefficient and difficult to sustain, especially when the data continue to grow over time [5], [6].
Automated sentiment analysis is therefore necessary to classify user opinions into positive and negative sentiment categories and to support service quality improvement in digital platform ecosystems.

Previous studies have applied various machine learning algorithms for sentiment analysis in digital application contexts. Naïve Bayes has been widely used in sentiment classification because of its simplicity and efficiency [7], [8], [9]. Decision Tree algorithms have also been applied in several sentiment analysis studies involving user feedback and application reviews [10], [11], [12]. Other studies have employed K-Nearest Neighbor to classify sentiment patterns in user-generated textual data [13], [14], [15], while Support Vector Machine has frequently been used because of its strong performance in text classification tasks [16], [17], [18]. A comparative study on sentiment classification for the Satu Sehat application reported that SVM achieved higher accuracy, at 87.95%, than Naïve Bayes, which reached 81.65% [19]. This finding suggests that SVM can perform better in sentiment classification, particularly when dealing with imbalanced data, which remains one of the main challenges in sentiment analysis.

Although prior studies have contributed to the development of sentiment analysis for digital applications, several methodological limitations remain. Many existing studies have not fully integrated lexicon-based automatic labeling, imbalance handling through the Synthetic Minority Over-sampling Technique (SMOTE), and automated hyperparameter optimization using frameworks such as Optuna. These components are important because sentiment datasets from application reviews often contain unequal class distributions, noisy informal language, and high-dimensional textual features. Without proper imbalance handling and parameter optimization, classification models may produce biased performance, particularly toward the majority class.
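To make the majority-class bias concrete, the following minimal sketch uses a hypothetical 90/10 split of positive and negative labels (illustrative counts, not the study's data) and a degenerate model that always predicts the majority class. The helper names `accuracy` and `per_class_recall` are assumptions introduced for illustration and are not part of the paper's pipeline.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Counter returns 0 for labels that were never predicted correctly.
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical imbalanced test set: 90 positive reviews, 10 negative ones.
y_true = ["positive"] * 90 + ["negative"] * 10

# A degenerate model that always predicts the majority class.
majority_label = Counter(y_true).most_common(1)[0][0]
y_pred = [majority_label] * len(y_true)

acc = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
macro_recall = sum(recall.values()) / len(recall)

print(f"accuracy     = {acc:.2f}")
print(f"recall       = {recall}")
print(f"macro recall = {macro_recall:.2f}")
```

Under these assumed counts, accuracy is 0.90 even though recall on the negative class is 0.0, which is why per-class metrics such as recall and F1-score are worth reading alongside accuracy on imbalanced review data.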
This majority-class bias is especially relevant in the context of user reviews, where negative and positive sentiment may not be evenly distributed.

In this context, the Gojek Driver application provides a relevant case for sentiment analysis. The application has a large and diverse user base, and its review data reflect various user perceptions of platform reliability, service quality, technical functionality, and operational experience. This study analyzes user sentiment toward the Gojek Driver application by comparing two widely used classification algorithms, namely Naïve Bayes and Support Vector Machine. Naïve Bayes was selected because of its ability to perform classification efficiently even with limited data [20], [21]. SVM was selected because of its capability to construct strong decision margins and its suitability for imbalanced text classification problems [22], [23], [24]. Previous studies also indicate that SVM can achieve strong performance when combined with SMOTE for imbalance handling [25], [26]. This study further integrates Optuna-based hyperparameter tuning to automatically optimize model parameters and improve classification performance. By combining SMOTE and stratified train-test splitting, this study examines whethe (...truncated)