Evaluating Random Forest Algorithm: Detection of Palm Oil Leaf Disease

Jan 2025

This research investigates the application of machine learning techniques for detecting diseases in oil palm leaves, utilizing a dataset of 1,119 images sourced from plantations in the Tanah Laut district. The dataset comprises 488 diseased and 631 healthy leaf samples, which were carefully cropped to isolate leaf areas and labeled with the assistance of domain experts. For feature extraction, both Lab and RGB color spaces were considered, alongside Haralick texture features, resulting in a total of eleven features per pixel. To reduce dimensionality and select relevant features, Principal Component Analysis (PCA) and Random Forest methods were applied. Support Vector Machine (SVM) was subsequently employed for the classification of leaf health status, and model performance was evaluated using accuracy, precision, recall, and F1 score metrics, all derived from a confusion matrix. The study finds that PCA and Random Forest significantly enhance model performance, improving the ability to distinguish between healthy and diseased leaves. These findings provide valuable insights for the development of automated disease detection systems in oil palm plantations, with potential applications in precision agriculture. Additionally, the results suggest pathways for further research into plant disease diagnostics, highlighting the role of advanced machine learning techniques in enhancing crop management and supporting sustainable agricultural practices.
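As a minimal sketch of the evaluation step described above, the following Python shows how accuracy, precision, recall, and F1 score can each be derived from the four cells of a binary confusion matrix. The names ConfusionMatrix and confusion_from_labels, the choice of "diseased" as the positive class, and the example counts are assumptions made for illustration; they are not taken from the study and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; 'diseased' is assumed to be the positive class."""
    tp: int  # diseased leaves predicted as diseased
    fp: int  # healthy leaves predicted as diseased
    fn: int  # diseased leaves predicted as healthy
    tn: int  # healthy leaves predicted as healthy

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def accuracy(self) -> float:
        # Share of all predictions that are correct.
        return (self.tp + self.tn) / self.total if self.total else 0.0

    def precision(self) -> float:
        # Share of predicted-diseased leaves that are truly diseased.
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        # Share of truly diseased leaves that the model found.
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f1(self) -> float:
        # Harmonic mean of precision and recall, written in terms of counts.
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0


def confusion_from_labels(y_true, y_pred, positive="diseased"):
    """Count the four confusion-matrix cells from paired true/predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, fn=fn, tn=tn)


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not results from the study.
    cm = ConfusionMatrix(tp=90, fp=10, fn=20, tn=100)
    print(f"accuracy={cm.accuracy():.3f} precision={cm.precision():.3f} "
          f"recall={cm.recall():.3f} f1={cm.f1():.3f}")
```

Because the dataset is moderately imbalanced (488 diseased versus 631 healthy images), reporting precision and recall alongside accuracy helps show whether errors fall mainly on missed diseased leaves or on false alarms.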

E-ISSN: 2807-9035
Volume 4, Number 2, November 2024
https://doi.org/10.47709/brilliance.v4i2.4798

Oky Rahmanto*, Veri Julianto, Ahmad Rusadi Arrahimi
Politeknik Negeri Tanah Laut, Indonesia
*Corresponding author

Article history: submitted 08-10-2024, accepted 21-01-2025, published 23-01-2025
Keywords: Random Forest, PCA, Leaf, Palm Oil

Brilliance: Research of Artificial Intelligence is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).

INTRODUCTION

Palm oil plantations play a crucial role in Indonesia's economy. The development of the palm oil industry in the country aims to meet environmental standards that ensure the quality of palm oil production (Ichsan, Saputra, & Permatasari, n.d.). Plantation management that prioritizes effectiveness and efficiency is key to increasing the productivity and profitability of the palm oil business. This can be achieved by applying precision agriculture concepts, which base management on the specific characteristics of each plot of land (Satia, Firmansyah, & Umami, 2022).

Computer vision, a field of artificial intelligence, has been developed for agricultural applications in image and video processing to assist decision making. It is capable of reasoning, adapting, and self-correcting much like humans. Computer vision is widely used in agriculture, including for monitoring harvests and land (Zheng et al., 2021), determining fruit ripeness, and detecting diseases (Septiarini, Hamdani, Hatta, & Kasim, 2019).

During the development stage, palm oil plantations are threatened by pests, diseases, and weeds, which can decrease plantation production (Semangu, 1989). Early detection of palm disease is essential to save the palm trees from excessive damage. It is also necessary to reduce human error in the disease identification process. Technology offers a clear benefit here: it can provide precise analysis and thereby reduce the risk of human error (Aji et al., 2013). Oil palm plant diseases typically manifest on the leaves, resulting in reduced crop quality.
This issue needs to be solved because demand for premium-quality palm oil keeps growing (Septiarini et al., 2022).

LITERATURE REVIEW

Plant disease detection has received considerable attention as a way to monitor symptoms at an early stage of plant growth (Masazhar & Kamal, 2017). In previous research, machine learning methods have been successfully applied to detect diseases in both leaves and fruits (Sharif et al., 2018). These methods typically involve stages such as preprocessing, segmentation, feature extraction, and classification (Ali, Lali, Nawaz, Sharif, & Saleem, 2017). Machine learning has also been effective in classifying citrus leaf diseases using local binary pattern (LBP) and color histogram approaches based on RGB and HSV. Classification was performed with the bagged tree method, which achieved an accuracy of 99.9% on 99 images of infected leaves and 100 images of healthy leaves (Ali et al., 2017). Another study, on disease classification in grapevines using k-nearest neighbors (KNN), achieved an accuracy of 98.75% (Saleem, Akhtar, Ahmed, & Qureshi, 2019). A different study on detecting diseases in jackfruit used SVM as the classifier and exponential spider monkey optimization (ESMO) for feature selection, resulting in an accuracy of 90%. Diseases in oil palm leaves have also been detected using color histograms and supervised classifiers (Hamdani, Septiarini, Sunyoto, Suyanto, & Utaminingrum, 2021). That study employed PCA for feature extraction and ANN for classification, achieving an accuracy of 99.67%.
This study will also focus on identifying diseases in oil palm leaves, modifying the feature selection process by using Random Forest and comparing it with PCA. The data will then be classified using a Support Vector Machine (SVM). Various approaches will be applied during the training and testing phases, and accuracy will be analyzed using a confusion matrix. Earlier research also examined the impact of using PCA in classifying healthy and diseased oil palm leaves. That study showed that when four or more features are used for classification, accuracy remains at 97% (Arrahimi, Julianto, & Rahmanto, 2024).

METHOD

Data Collection and Preparation

The research begins with the systematic collection of oil palm leaf samples from plantations in the Tanah Laut district. A total of 1,119 leaves are gathered, ranging from healthy to diseased. Each leaf is photographed under controlled conditions to ensure consistency in the data. The captured images are then cropped to remove extraneous background elements, leaving only the leaf area. This step minimizes noise in the dataset and ensures that the analysis concentrates on relevant features. The resulting images form the foundational dataset for the subsequent steps in the research. This (...truncated)


This is a preview of a remote PDF: https://jurnal.itscience.org/index.php/brilliance/article/download/4798/4050
Article home page: https://jurnal.itscience.org/index.php/brilliance/article/view/4798/4050

Oky Rahmanto, Veri Julianto, Arrahimi Ahmad Rusadi. Evaluating Random Forest Algorithm: Detection of Palm Oil Leaf Disease, 2025, pp. 919-924,