Vol. 18, No. 2 September 2022 | DOI: 10.33480/pilar.v18i2.3940
COMPARISON OF PORTER'S STEMMING ALGORITHM AND NAZIEF &
ADRIANI'S STEMMING ALGORITHM IN DETERMINING INDONESIAN
LANGUAGE LEARNING MODULES
Siti Tuhpatussania; Ema Utami; Anggit Dwi Hartanto
Informatics Engineering Study Program
AMIKOM Yogyakarta University
Yogyakarta, Indonesia
www.amikom.ac.id
(*) Corresponding Author
Abstract: One method for improving the performance of text
summarization, so that complete information can be obtained
from a learning module, is to transform the words in the
module into their base words, in other words, through a
stemming process. Stemming Indonesian-language text is more
complex because word affixes must be removed to obtain the
root word of a word. This research therefore compares two
stemming algorithms, Porter and Nazief & Adriani, on
learning modules at Mataram University of Technology. In
testing, the Nazief & Adriani stemming algorithm had an
average processing time of 51.8 seconds with an average
accuracy of 74.175%, while the Porter algorithm had an
average processing time of 16.875 seconds with an accuracy
of 73.225%.
Keywords: Indonesian stemming, Porter algorithm,
Nazief & Adriani algorithm.
INTRODUCTION
One source of information for students is the learning
module, which is commonly used as a learning medium. A
learning module encourages students to study independently
and helps the lecturer convey material to achieve the
learning objectives (Laurensius Setyabudhi & Sanusi, 2019).
However, learning modules generally contain hundreds of
pages, which makes students reluctant to dig for further
information, so the information they obtain may not relate
accurately to the courses they take. A text summary of
every learning module is therefore needed so that students
can search for information effectively and efficiently.
Effective means that the user gets information relevant to
the query entered. Efficient means that the search takes a
short time.
Stemming is one method used to improve the performance of
text summarization. It transforms the words in a learning
module into their base words, and the base words are then
weighted so that the summary can represent the whole of the
original document (Jatikusumo & Derajad Wijaya, 2021). The
stemming algorithm differs for every language (Simanjuntak,
2022). For example, English and Indonesian have different
morphology, so the stemming algorithms they need are
different. Stemming Indonesian text is more complex because
affixes must be removed to obtain the root word (base word)
of a word (Simanjuntak, 2022).
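As a minimal sketch of the idea above, the Python snippet
below reduces words to base words and then weights each base
word by how often it occurs. The lookup table TOY_STEMS and
the helper names are hypothetical stand-ins for a real
stemmer, and raw term frequency is an assumed weighting
scheme; the cited work may weight words differently.

```python
from collections import Counter

# Hypothetical lookup table standing in for a real stemmer's output.
TOY_STEMS = {
    "belajar": "ajar",
    "pelajaran": "ajar",
    "mengajar": "ajar",
    "membaca": "baca",
    "bacaan": "baca",
}


def toy_stem(word: str) -> str:
    """Return the base word from the toy table, or the word unchanged."""
    return TOY_STEMS.get(word, word)


def weight_base_words(text: str) -> Counter:
    """Count how often each base word occurs after stemming (term frequency)."""
    tokens = text.lower().split()
    return Counter(toy_stem(token) for token in tokens)


weights = weight_base_words("belajar membaca pelajaran bacaan mengajar")
# Base words with the highest counts are candidates for the summary.
print(weights.most_common())
```

Without stemming, the five input words would each be counted
once; after stemming they collapse into two base words whose
counts can be compared.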
Word formation can occur through affixation, reduplication,
and compounding (compositum) (Harja Susetya & Harja Susetya,
2022): affixation (e.g., "ajar" becomes "belajar"),
reduplication (e.g., "meja" becomes "meja-meja"), and
compounding (e.g., "mata" becomes "matahari"). Moreover,
Indonesian has 35 official affixes listed in the Kamus Besar
Bahasa Indonesia (the Great Dictionary of the Indonesian
Language), and these affixes can take the form of prefixes,
suffixes, confixes, or infixes absorbed from Javanese.
Because Indonesian has so many rules, an appropriate
stemming algorithm is needed to determine the base words of
Indonesian text.
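The word-formation types above can be illustrated with a
small dictionary-checked affix stripper. This is only a
sketch under stated assumptions: it is neither Porter's nor
Nazief & Adriani's algorithm, the root dictionary ROOT_WORDS
is a tiny hypothetical sample, and the affix lists cover only
a few of the official affixes.

```python
# Tiny hypothetical root-word dictionary; a real one has thousands of entries.
ROOT_WORDS = {"ajar", "meja", "mata", "baca", "makan"}

# A few common affixes only; illustrative, not the full official inventory.
PREFIXES = ("meng", "mem", "men", "me", "ber", "bel", "di", "ter", "pe")
SUFFIXES = ("kan", "an", "i")


def strip_affixes(word: str) -> str:
    """Return a dictionary root for `word`, or `word` itself if none is found."""
    word = word.lower()
    # Reduplication: "meja-meja" -> "meja" when both halves match.
    if "-" in word:
        left, right = word.split("-", 1)
        if left == right:
            word = left
    if word in ROOT_WORDS:
        return word
    # Candidates: the word itself, then the word with one suffix removed.
    candidates = [word] + [word[: -len(s)] for s in SUFFIXES if word.endswith(s)]
    for candidate in candidates:
        if candidate in ROOT_WORDS:
            return candidate
        # Try removing one prefix and check the remainder against the dictionary.
        for prefix in PREFIXES:
            remainder = candidate[len(prefix):]
            if candidate.startswith(prefix) and remainder in ROOT_WORDS:
                return remainder
    return word


for example in ("belajar", "meja-meja", "diajarkan", "matahari"):
    print(example, "->", strip_affixes(example))
```

The compound "matahari" is returned unchanged because this
sketch has no rule for splitting compounds, which hints at
why a simple rule list is not enough for Indonesian text.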
Research related to base-word determination, or stemming,
for Indonesian has been carried out by Damayanti (2022). The
purpose of that word-similarity search was to reduce
plagiarism. The search looked not for synonyms but for words
identical or close to the original word being compared. The
data used in that study were two scientific papers, which
went through preprocessing stages up to stemming with the
Nazief & Adriani algorithm to obtain the root words; the
root words were then used as the reference for the level of
similarity between the two papers. The Nazief & Adriani
stemming algorithm produced better accuracy because it can
apply rules for restoring words that have already been
stemmed in the scientific papers.
A related study was carried out by Mandar et al. (2020), who
classified Indonesian news using Naïve Bayes with Porter
stemming. On 15 news items, the combination of the two
methods produced an accuracy of 79%, with a precision of
1.000, a recall of 0.6666, and an F-measure of 0.7951.
In previous research, the Porter and Nazief & Adriani
stemming algorithms each had their own advantages in
determining base words, but the documents being stemmed were
relatively small and neatly arranged. This motivates the
present study to apply the stemming algorithms to determine
the base words of learning modules, where one document
contains 15,000 words on average and is unstructured. This
study compares Porter's stemming algorithm and Nazief &
Adriani's stemming algorithm in order to determine which
algorithm achieves the best accuracy and duration of the
stemming process on Indonesian-language learning modules.
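A comparison of this kind needs two measurements per
algorithm: accuracy against known base words and processing
duration. The sketch below shows one way to obtain both in
Python. The function evaluate_stemmer, the gold list GOLD,
and the stand-in suffix_only_stemmer are hypothetical; they
are not the study's code or data, and the accuracy definition
(share of words whose output equals the expected base word)
is an assumption.

```python
import time
from typing import Callable, Dict, Tuple


def evaluate_stemmer(
    stem: Callable[[str], str], gold: Dict[str, str]
) -> Tuple[float, float]:
    """Return (accuracy in percent, elapsed seconds) for `stem` on `gold`."""
    start = time.perf_counter()
    predictions = {word: stem(word) for word in gold}
    elapsed = time.perf_counter() - start
    # A prediction is correct when it equals the expected base word.
    correct = sum(1 for word, root in gold.items() if predictions[word] == root)
    accuracy = 100.0 * correct / len(gold) if gold else 0.0
    return accuracy, elapsed


# Hypothetical gold standard: surface word -> expected base word.
GOLD = {"belajar": "ajar", "membaca": "baca", "makanan": "makan", "meja": "meja"}


def suffix_only_stemmer(word: str) -> str:
    """Stand-in stemmer that removes at most one suffix (illustration only)."""
    for suffix in ("kan", "an", "i"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


accuracy, seconds = evaluate_stemmer(suffix_only_stemmer, GOLD)
print(f"accuracy = {accuracy:.1f}%, duration = {seconds:.6f} s")
```

Passing each stemmer under comparison to the same function
with the same gold list keeps the two measurements
comparable.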
MATERIALS AND METHODS
This stemming research uses a descriptive-qualitative
method. The qualitative method investigates truth that is
inherently relative and theoretical, and uses hermeneutics
as a step in the search for meaning and interpretation
(Zaluchu, 2021). Descriptive research, in turn, is a method
for examining the status of a group of people, a subject, a
set of conditions, a system of thought, or a class of
current events (Aprilliwanto et al., 2021). Research
conducted with the qualitative method is usually presented
descriptively.
In this research, the process involves data processing
through several stages. The first is data preparation, that
is, preparing a dataset of learning modules in PDF and DOCX
formats that are not locked. Next is the data preprocessing
stage, namely the process of breaking the words of a single
document down into root words,
and the last step, evaluation, (...truncated)