COMPARISON OF PORTER'S STEMMING ALGORITHM AND NAZIEF & ADRIANI'S STEMMING ALGORITHM IN DETERMINING INDONESIAN LANGUAGE LEARNING MODULES

Vol. 18, No. 2 September 2022 | DOI: 10.33480/pilar.v18i2.3940

Siti Tuhpatussania; Ema Utami; Anggit Dwi Hartanto
Informatics Engineering Study Program
AMIKOM Yogyakarta University
Yogyakarta, Indonesia
www.amikom.ac.id
(*) Corresponding Author

Abstract
One way to improve the performance of text summarization, so that complete information can be obtained from a learning module, is to transform the words in the module into their base words, that is, to apply a stemming process. Stemming Indonesian text is more complex because affixes must be removed to obtain the root word, so this research compares two stemming algorithms, Porter and Nazief & Adriani, on learning modules at Mataram University of Technology. In testing, the Nazief & Adriani stemming algorithm had an average processing time of 51.8 seconds and an average accuracy of 74.175%. Porter's algorithm had an average processing time of 16.875 seconds and an accuracy of 73.225%.

Keywords: Indonesian stemming, Porter algorithm, Nazief & Adriani algorithm.

Abstrak
Salah satu cara yang digunakan untuk meningkatkan performa peringkasan teks agar mendapatkan informasi yang utuh pada sebuah modul pembelajaran yaitu dengan mentransformasi kata-kata dalam sebuah modul tersebut ke kata dasarnya atau dengan kata lain melalui proses stemming. Proses stemming pada teks berbahasa Indonesia lebih rumit/kompleks karena terdapat imbuhan kata yang harus dibuang untuk mendapatkan root word (kata dasar) dari sebuah kata sehingga pada penelitian ini akan membandingkan dua algoritma stemming Porter dan stemming Nazief & Adriani pada modul pembelajaran di Universitas Teknologi Mataram.
Hasil pengujian dari algoritma stemming Nazief & Adriani pada durasi proses rata-rata 51,8 detik dengan akurasi rata-rata 74,175%. Sedangkan pada algoritma Porter durasi proses rata-rata 16,875 detik dengan akurasi 73,225%.

Kata kunci: stemming bahasa Indonesia, algoritma porter, algoritma nazief & adriani.

INTRODUCTION

One source of information for students is the learning module, which is commonly used as a learning medium. A learning module encourages students to study independently and helps lecturers deliver material to achieve the learning objectives (Laurensius Setyabudhi & Sanusi, 2019). However, learning modules generally run to hundreds of pages, which makes students reluctant to dig further, so the information they obtain does not accurately reflect the courses they take. A text summary of each learning module is therefore needed so that students can search for information effectively and efficiently. Effective means that the user gets information relevant to the query entered; efficient means that the search takes little time.

Stemming is one method used to improve text summarization performance: the words in a learning module are transformed into their base words, which are then weighted so that the summary can represent the whole of the original document (Jatikusumo & Derajad Wijaya, 2021). Stemming algorithms differ from language to language (Simanjuntak, 2022). For example, English and Indonesian have different morphology, so they require different stemming algorithms. Stemming Indonesian text is more complex because affixes must be removed to obtain the root word (base word) of a word (Simanjuntak, 2022). Words can be formed through affixation, reduplication, and compounding (compositum) (Harja Susetya & Harja Susetya, 2022): affixation (e.g., "ajar" becomes "belajar"), reduplication (e.g., "meja" becomes "meja-meja"), and compounding (e.g., "mata" becomes "matahari"). Moreover, Indonesian has 35 official affixes listed in the Kamus Besar Bahasa Indonesia (the official dictionary of the Indonesian language), and affixes can take the form of prefixes, suffixes, confixes, or infixes absorbed from Javanese. Given these many rules, an appropriate stemming algorithm is needed to determine the base words of Indonesian text.

Related research on base-word determination, or stemming, for Indonesian was carried out by Damayanti (2022), with the aim of finding similar words in order to reduce plagiarism. The similarity sought is not synonymy but words that are the same as, or close to, the original word being compared. The data in that study were two scientific papers, which passed through the preprocessing stages up to stemming with the Nazief & Adriani algorithm to obtain root words; the root words were then used as the reference for the level of similarity between the two papers. The Nazief & Adriani stemming algorithm produced better accuracy because it can apply rules to restore words that have already been stemmed in the scientific papers. A related study was carried out by Mandar et al. (2020), who classified Indonesian news using Naïve Bayes with Porter stemming. On 15 news items, using both methods yielded an accuracy of 79%, with a precision of 1.000, a recall of 0.6666, and an F-measure of 0.7951.

In previous research, the Porter and Nazief & Adriani stemming algorithms each had their own advantages in determining base words, but the documents stemmed were relatively small and neatly arranged. This motivated the present study to apply the stemming algorithms to determining the base words of learning modules. Each document contains an average of 15,000 words and is unstructured.
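To make the affix-removal idea discussed above concrete, the following is a minimal sketch of dictionary-backed affix stripping in Python. It is not the Porter algorithm or the full Nazief & Adriani algorithm, and it is not the implementation used in this study; the tiny dictionary BASE_WORDS, the abbreviated affix lists, and the function names are assumptions made only for illustration. It shows how a reduplicated or affixed word can be reduced to a base word once the stripped form is found in a dictionary.

```python
# Toy dictionary of base words (assumption): a real stemmer would load a
# full Indonesian lexicon instead of this handful of entries.
BASE_WORDS = {"ajar", "meja", "mata", "baca", "makan", "main"}

# Abbreviated, illustrative affix lists (assumption): far from the complete
# set of official Indonesian affixes.
PARTICLES = ("lah", "kah", "pun")
POSSESSIVES = ("ku", "mu", "nya")
DERIVATIONAL_SUFFIXES = ("kan", "an", "i")
PREFIXES = ("meng", "mem", "men", "me", "ber", "bel", "be",
            "di", "ke", "se", "ter", "pe")


def strip_suffix(word, suffixes):
    """Remove the first matching suffix, keeping at least three letters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def stem(word):
    """Return a base word from BASE_WORDS, or the input if none is found."""
    word = word.lower()

    # Reduplication: "meja-meja" collapses to "meja".
    if "-" in word:
        left, right = word.split("-", 1)
        if left == right:
            word = left

    # A dictionary hit stops the process immediately.
    if word in BASE_WORDS:
        return word

    # Strip suffix classes one after another, checking the dictionary
    # after each step.
    candidate = word
    for suffixes in (PARTICLES, POSSESSIVES, DERIVATIONAL_SUFFIXES):
        candidate = strip_suffix(candidate, suffixes)
        if candidate in BASE_WORDS:
            return candidate

    # Try a prefix on the suffix-stripped form, then on the original word.
    for form in (candidate, word):
        for prefix in PREFIXES:
            if form.startswith(prefix) and form[len(prefix):] in BASE_WORDS:
                return form[len(prefix):]

    # Unknown words are returned unchanged rather than guessed at.
    return word


if __name__ == "__main__":
    for w in ("belajar", "meja-meja", "diajarkan", "makanan"):
        print(w, "->", stem(w))
```

Checking the dictionary after every stripping step is what keeps a stemmer of this kind from over-stemming, at the cost of extra lookups. A purely rule-based stemmer that skips the dictionary does less work per word. Compounds such as "matahari" are not decomposed by this sketch.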
This study compares Porter's stemming algorithm and the Nazief & Adriani stemming algorithm in order to determine which achieves the best accuracy and processing time when stemming Indonesian-language learning modules.

MATERIALS AND METHODS

This stemming research uses a descriptive-qualitative method. The qualitative method investigates truth that is relative and theoretical in nature and uses hermeneutics as a step in seeking meaning and interpretation (Zaluchu, 2021). Descriptive research, in turn, is a method of examining the status of a group of people, a subject, a set of conditions, a system of thought, or a class of current events (Aprilliwanto et al., 2021). Research conducted with the qualitative method is usually presented descriptively. In this research, data processing proceeds through several stages. The first is data preparation, that is, preparing a dataset of learning modules in PDF and DOCX formats that are not locked. Next come the data preprocessing stages, namely the process of reducing the words of a single document to their root words, and the last step is evaluation, (...truncated)