Comparison Of K-Means and K-Medoids Algorithm for Clustering Data UMKM in Pagar Alam City

Jurnal Sisfokom (Sistem Informasi dan Komputer), Jun 2024

The aim of this research is to cluster MSME data in Pagar Alam City using the K-Means and K-Medoids algorithms. The research is motivated by the lack of further management of the collected MSME data, which can hinder the development and improvement of MSMEs in Pagar Alam City, even though this data is considered necessary for agencies to develop and improve those MSMEs. Apart from agencies, the data is also useful for sub-districts and RT/RW in finding out what interests, talents, and potential the community has, and in which business fields. The MSME data is processed using RapidMiner and Python. The research follows the Cross Industry Standard Process for Data Mining (CRISP-DM), whose stages are Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. Clustering quality is tested with the Davies-Bouldin Index (DBI); a DBI value closer to 0 indicates better clustering. The research obtained 3 clusters. In 2020, K-Means produced C0 = 1, C1 = 3, and C2 = 1 sub-districts, while K-Medoids produced C0 = 1, C1 = 1, and C2 = 3 sub-districts. In 2022, both K-Means and K-Medoids produced C0 = 1, C1 = 3, and C2 = 1 sub-districts. For the 2020 sub-district clustering, the DBI is 0.134 for K-Means and 0.523 for K-Medoids; for 2022, it is 0.277 for K-Means and 0.496 for K-Medoids. It can therefore be concluded that K-Means performs better for grouping MSMEs in Pagar Alam City, because its DBI value is closer to 0. The grouping results can help give related parties an overview for encouraging or assisting sub-districts that fall into the low cluster.
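As a hedged illustration of the evaluation step, one common definition of the Davies-Bouldin Index is DBI = (1/k) Σ_i max_{j≠i} (S_i + S_j) / M_ij, where S_i is the average distance of the members of cluster i to its centroid and M_ij is the distance between the centroids of clusters i and j. The sketch below is not the authors' RapidMiner or Python workflow: the function names and the five two-feature points are hypothetical, invented only to mirror five sub-districts split into clusters of sizes 1, 3, and 1.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(points, labels):
    """Davies-Bouldin Index with Euclidean distance (lower is better).

    Assumes at least two clusters and that no two centroids coincide
    (coinciding centroids would cause a division by zero).
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    ids = sorted(clusters)
    centers = {c: centroid(clusters[c]) for c in ids}
    # S_i: mean distance from each member of cluster i to its centroid.
    scatter = {
        c: sum(math.dist(p, centers[c]) for p in clusters[c]) / len(clusters[c])
        for c in ids
    }
    # For each cluster, keep the worst (largest) similarity ratio to any other.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in ids
            if j != i
        )
        for i in ids
    ]
    return sum(worst) / len(ids)


# Hypothetical data: five "sub-districts" described by two made-up features.
points = [(10.0, 1.0), (50.0, 5.0), (52.0, 6.0), (48.0, 4.0), (90.0, 9.0)]
labels_a = [0, 1, 1, 1, 2]  # compact grouping with sizes 1 / 3 / 1
labels_b = [0, 0, 1, 1, 2]  # looser grouping that mixes distant points

dbi_a = davies_bouldin(points, labels_a)
dbi_b = davies_bouldin(points, labels_b)
print(f"DBI (compact grouping): {dbi_a:.3f}")
print(f"DBI (looser grouping):  {dbi_b:.3f}")
```

On this toy data the compact grouping scores closer to 0 than the looser one, which is the same reading the abstract applies when comparing the two algorithms.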

Jurnal SISFOKOM (Sistem Informasi dan Komputer), Volume 13, Nomor 02, PP 193-199

Sendy Ariska [1]*, Desi Puspita [2], Inda Anggraini [3]
Institut Teknologi Pagar Alam, Pagar Alam, Indonesia [1], [2], [3]

Keywords: K-Means, K-Medoids, CRISP-DM, Davies-Bouldin Index.

I. INTRODUCTION

Along with the development of the internet, the volume of stored data, whether text, images, sound, or video, has increased quickly and significantly. Large volumes of data become "garbage" in storage if they are not processed into useful information, which requires a technique called data mining [1]. Data mining, also called Knowledge Discovery in Databases (KDD), is defined as the extraction of potentially useful, implicit, and previously unknown information from a set of data. The KDD process extracts trends and patterns from data and then converts the results into information that is accurate and easy to understand. Data mining can play several roles, one of which is clustering [2].

Clustering is a data analysis method, often counted among data mining methods, whose aim is to group data with similar characteristics into the same "region" and data with different characteristics into other "regions" [3]. Two clustering methods are K-Means and K-Medoids. The K-Means algorithm groups data based on the cluster center point (centroid) closest to each data point [4]. The K-Medoids algorithm, or Partitioning Around Medoids (PAM), is a partitional clustering method that groups a set of n objects into k clusters [5].

Based on the results of observations and interviews in the cooperative sector at the Department of Industry, Trade and UKM Cooperatives in Pagar Alam City, MSMEs (UMKM) in Pagar Alam City have increased both in number and in the variety of their business fields.
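The K-Means and K-Medoids definitions given in the introduction differ mainly in how a cluster center is chosen: K-Means uses the mean of the members (a point that need not exist in the data), while K-Medoids uses the member with the smallest total distance to the others. The following is a minimal sketch of that single step, assuming Euclidean distance; the helper names and the five points are hypothetical, and it does not implement the full iterative algorithms or the authors' RapidMiner setup.

```python
import math


def mean_center(cluster):
    """K-Means style center: coordinate-wise mean of the cluster members."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))


def medoid_center(cluster):
    """K-Medoids style center: the member with the smallest total distance
    to all other members of the cluster."""
    return min(cluster, key=lambda c: sum(math.dist(c, p) for p in cluster))


def assign(points, centers):
    """Assign each point to the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]


# Hypothetical cluster with one extreme member.
cluster = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (40.0, 40.0)]

mean = mean_center(cluster)      # pulled toward the extreme point
medoid = medoid_center(cluster)  # always one of the actual members

print("mean center:  ", mean)
print("medoid center:", medoid)
print("assignment:   ", assign([(1.0, 1.0), (40.0, 40.0)], [medoid, (40.0, 40.0)]))
```

Because the medoid is always an actual member, it is less affected by an extreme value than the mean, which is one common reason the two algorithms can produce different groupings on the same data.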
The relevant parties at the Department of Industry, Trade and UKM Cooperatives in Pagar Alam City collect MSME data twice a year, directly through face-to-face meetings with respondents, who are MSME actors throughout Pagar Alam City. The latest data, for 2022, records a total of 2,906 MSMEs across 5 sub-districts. The collected data is currently used as accurate reference data in making government policies to develop Micro, Small and Medium Enterprises (MSMEs; UMKM in Indonesian) in Pagar Alam City. According to the data, MSMEs in Pagar Alam City span various business sectors across several sub-districts. However, the collected data receives no further management, which can hinder the development and improvement of MSMEs in Pagar Alam City, even though this data is considered necessary for agencies to develop and improve them. Apart from agencies, the data is also useful for sub-districts and RT/RW in finding out what interests, talents, and potential the community has in the MSME business sector, so that it can inform strategies to improve and develop MSMEs and support the economy of MSME business actors. Therefore, further data management is needed so that the data can support policy decisions on improving and developing MSMEs. Given the large amount of MSME data and the number of sub-districts, the MSME data in Pagar Alam City needs to be grouped to determine the high and low levels of the number of MSMEs by sub-district, and the groups with the most MSME business sectors in Pagar Alam City.
Data mining is then needed to process the data into useful information, using clustering with a comparison of the K-Means and K-Medoids algorithms to group the MSME data by the high and low levels of the number of MSMEs in each sub-district and the number of existing business fields. Grouping the data on MSMEs in Pagar Alam City can help agencies focus on sub-districts that still have low numbers of MSMEs, in order to provide assistance in improving and developing MSMEs in Pagar Alam City. Apart from agencies, it is also useful for sub-districts and RT/RW to focus on the interests, talents, and potential of the community in the business fields with the highest numbers, providing assistance so that they can be further improved and developed, especially in sales results, in order to support the economy of MSME business actors.

A clustering process using the RapidMiner application produced 53 percent of the data in cluster 1 (a total of 8 data points), 40 percent in cluster 2, and 7 percent in cluster 3 [6]. Clustering process using the RapidMiner application with the K-Means and K-Medoids algorithms. Based on research conducted by [7] with the title "Comparison of the K-Means Algorithm and the K-Medoid Algorithm for Grouping MSME (...truncated)

p-ISSN 2301-7988, e-ISSN 2581-0588. DOI: 10.32736/sisfokom.v13i2.2090. Copyright ©2024. Submitted: February 12, 2024; Revised: March 1, 2024; Accepted: March 19, 2024; Published: June 15, 2024.


This is a preview of a remote PDF: https://jurnal.atmaluhur.ac.id/index.php/sisfokom/article/download/2090/988
Article home page: https://jurnal.atmaluhur.ac.id/index.php/sisfokom/article/view/2090/988

Sendy Ariska, Desi Puspita, Inda Anggraini. Comparison Of K-Means and K-Medoids Algorithm for Clustering Data UMKM in Pagar Alam City. Jurnal Sisfokom (Sistem Informasi dan Komputer), 2024, pp. 193-199.