Data and Process Mining in Analysing Student Behaviour
Interdisciplinary Description of Complex Systems 23(5), 467-483, 2025
DATA AND PROCESS MINING IN
ANALYSING STUDENT BEHAVIOUR
Snježana Križanić*, Katarina Tomičić-Pupek and Neven Vrček
University of Zagreb, Faculty of Organization and Informatics
Varaždin, Croatia
DOI: 10.7906/indecs.23.5.4
Regular article
Received: 30 April 2025.
Accepted: 1 September 2025.
ABSTRACT
The diversity of students’ learning paths is crucial for acquiring knowledge. Although there are digital
learning environments that provide many opportunities for managing the learning process, the rapid
development of technologies can cause disruptions in the realisation of targeted engagement scenarios.
Monitoring educational content use and increasing interaction frequency can contribute to better
performance management and achievement of learning outcomes.
Data and process mining methods and tools play a significant role in the research of performance and
deviations. Anonymized real data from one elective university course was collected and processed to
create a dataset for the application of clustering and decision tree analysis in the KNIME Analytics
Platform and for creating a process model in a process mining tool. The results show behavioural
patterns for three clusters and provide insight into interaction types by identifying variables related to
content engagement as effective discriminators for student grouping. The process model illustrates the
diversity of engagement in choosing learning paths through the course (based on 51 cases performing
52 distinct activities with an average of 233 activities), while retaining the focus on the assignment
deliverables. Insights obtained from the analyses are useful for the effective implementation of digital
learning environments. They indicate that no exceptional scenarios occurred in the course, that is, no
deviations in behaviour on the digital learning platform compared with similar teaching and learning
paradigms provided by the same teachers, and that more interactive features combined with new
technologies would be useful in providing more personalized learning paths.
KEY WORDS
data mining, clustering, decision tree, process mining, educational data
CLASSIFICATION
ACM: H33
JEL: D83, M00, O31
*Corresponding author: +385 42 390 893;
*FOI, Pavlinska 2, HR – 42 000 Varaždin, Croatia
INTRODUCTION
Ensuring engaged learning and providing feasible learning paths that fit various behavioural
patterns are desirable goals when using digital learning environments. Teachers strive to
develop predictive models, based on real data, for improving teaching and course content
management strategies. By using data on students’ interaction and performance, teachers
can discover learning patterns, supporting nearly personalized learning paths for students with
different behavioural types. Data and process mining methods and techniques promise to
reveal patterns and sequences and to produce visualizations and analytics that help predict
student performance, deviations and expected use cases, thereby enabling data-driven
education management. The expected aim of more efficient education management is to give
all stakeholders insight into how to design and maintain a satisfying customer journey (i.e., a
student journey through a course) that consists of enough (but not too many) activities that are
challenging (but not too hard to pass), delivered through interaction touchpoints and
resources that enable the acquisition of the desired learning outcomes.
Recent developments in educational process mining have focused on discovering student
learning paths and analysing behavioural patterns within digital learning environments [1].
Educational data mining (EDM) and process mining have emerged as powerful analytical
approaches for understanding and improving educational processes. The systematic application
of data mining techniques in educational settings has been extensively documented, with
researchers demonstrating the potential for analysing student performance patterns and
predicting academic outcomes [2]. The implementation of multimodal learning analytics has
opened new avenues for understanding help-seeking behaviours and student interactions with
both automated systems and human experts [3]. Comprehensive systematic reviews have
confirmed that classification algorithms are most frequently applied in educational settings for
evaluating student academic outcomes and identifying at-risk learners, while clustering
techniques are commonly used for behavioural profiling and dropout prediction [4, 5]. These
reviews emphasize that no single model uniquely predicts student performance, and the
effectiveness of approaches depends heavily on data quality and contextual factors [5].
Although the motives for analysing student behaviour are often diverse, in this case the main
goal was to investigate whether any exceptional scenarios occurred. This was done with
consideration of the disruptive impact of new technologies on teaching and learning, and with
a focus on what could be inferred from the measured engagement levels in one elective course
at a higher education institution. The aim of this study is to explore student behavioural patterns
in a digital learning environment using data and process mining techniques in order to identify
engagement levels and detect potential deviations impacting course design and teaching
strategies.
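To make the notions of engagement level and learning path concrete, the following minimal Python sketch summarises a small event log and counts directly-follows relations, a basic building block of process discovery. It is only an illustration: the event log, the student identifiers and the activity names are hypothetical and are not taken from the course data, and the sketch does not reproduce the KNIME workflow or the process mining tool used in this study.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity) pairs, assumed to be
# already sorted by timestamp within each case. Names are illustrative.
EVENT_LOG = [
    ("s1", "View lecture"),
    ("s1", "Open quiz"),
    ("s1", "Submit assignment"),
    ("s2", "View lecture"),
    ("s2", "View lecture"),
    ("s2", "Submit assignment"),
    ("s3", "Open quiz"),
    ("s3", "View lecture"),
    ("s3", "Open quiz"),
    ("s3", "Submit assignment"),
]


def build_traces(event_log):
    """Group activities by case, preserving their order of occurrence."""
    traces = defaultdict(list)
    for case_id, activity in event_log:
        traces[case_id].append(activity)
    return dict(traces)


def summarise(traces):
    """Return the number of cases, distinct activities and mean events per case."""
    total_events = sum(len(trace) for trace in traces.values())
    distinct = {activity for trace in traces.values() for activity in trace}
    return {
        "cases": len(traces),
        "distinct_activities": len(distinct),
        "mean_events_per_case": total_events / len(traces),
    }


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


if __name__ == "__main__":
    traces = build_traces(EVENT_LOG)
    print(summarise(traces))
    for (a, b), count in directly_follows(traces).most_common():
        print(f"{a} -> {b}: {count}")
```

The summary corresponds to the kind of descriptive figures usually reported for an event log (cases, distinct activities, events per case), while the directly-follows counts show which transitions between activities dominate the observed learning paths.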
The article is organized in four main sections as follows. The Literature Review section
identifies previous achievements in the field of data and process mining in education. The
Methodology section describes the research design and procedures employed in this study. In
the Data subsection, the dataset used for the research is presented in detail. Subsequently, the
subsection Clustering and Decision Tree Procedure explains the application of data mining
techniques, specifically clustering and decision tree analysis. Similarly, the Process Mining
subsection describes the process mining approach applied in this study. The results are
discussed in the Results section. The article concludes with the Discussion and Conclusion.
LITERATURE REVIEW
The reviewed literature demonstrates a growing interest in applying data-driven approaches,
particularly clustering, classification, and process mining, in order to analyse and improve
educational processes. Numerous studies employed process mining techniques such as process
discovery and conformance checking to uncover students’ learning paths, behavioural patterns,
or system-level process inefficiencies. Clustering methods, especially k-means and
Expectation-Maximization, were frequently used to group students based on their interactions
or performance levels, while classification algorithms like Decision Trees, Naïve Bayes, and
Support Vector Machines were applied to predict academic outcom (...truncated)