Using data mining on student behavior and cognitive style data for improving e-learning systems: a case study
International Journal of Computational Intelligence Systems, Vol. 5, No. 3 (June, 2012), 597-610
Milos Jovanovic, Milan Vukicevic, Milos Milovanovic, Miroslav Minovic
Faculty of Organizational Sciences, University of Belgrade, Jove Ilica 154
Belgrade, Serbia
E-mail: {milos.jovanovic, milan.vukicevic, milos.milovanovic, miroslav.minovic}@fon.bg.ac.rs
www.bg.ac.rs
Received 6 December 2011
Accepted 15 May 2012
Abstract
In this research we applied classification models to predict students’ performance, and cluster models to
group students by their cognitive styles in an e-learning environment. The classification models described in this
paper should help teachers, students and business people engage early with students who are likely to
excel in a selected topic. Clustering students by cognitive style and overall performance
should enable better adaptation of the learning materials to their learning styles. The approach is tested
using well-established data mining algorithms and evaluated by several evaluation measures. The model building
process included data preprocessing, parameter optimization and attribute selection steps, which enhanced the
overall performance. Additionally, we propose a Moodle module that allows automatic extraction of the data needed for
educational data mining analysis and deploys the models developed in this study.
Keywords: educational data mining, prediction, students, performance, classification, clustering, Moodle.
1. Introduction
Moodle is an open source Learning Management
System (LMS) that is mostly regarded as a Course
Management System by the open community. It is
predominantly used in higher education and has proven
to be a successful tool in that setting.1,2 For that reason our
faculty built a distance learning system (DLS) based on
the Moodle LMS. The system was developed as
an in-house solution at the University of Belgrade for
students of Information Technology. One of the main
requirements was to completely support the distance
learning process in all its aspects. The system supports
advanced courses that use multimedia
lessons, advanced workshops and face-to-face
communication through video conferencing.
Web-based learning management systems are
extensively used nowadays and produce vast amounts of
data that are potentially useful for improving the
educational process.2,4,5 The emerging field of
Educational Data Mining (EDM) is concerned with
developing methods that discover knowledge from data
originating from educational (traditional or distance
learning) environments.6 Increasing research interest in
using data mining in education has been recorded in the last
decade7,8,9,10,11 with a focus on different aspects of
the educational process (e.g. students, teachers, teaching
materials, organization of classes etc.).
Benefits from extracting knowledge from e-learning
data are expected under the assumption that the trails of
user actions can be used to identify specific information
about users. We hope that the user behavior captured in log
files and recorded in data structures can be used to
create models that predict user behavior or describe
its peculiarities. Several groups of people
can leverage this knowledge and are potential
stakeholders: students, teachers, e-learning system
administrators and university management.
These stakeholders could use this knowledge for
different goals9:
Published by Atlantis Press
Copyright: the authors
1. Applications dealing with the assessment of students’
learning performance.
2. Applications that provide course adaptation and
learning recommendations based on the students’
learning behavior.
3. Approaches dealing with the evaluation of learning
material and educational web based courses.
4. Applications that involve feedback to both teachers
and students of e-learning courses, based on the
students’ learning behavior.
5. Developments for the detection of atypical students’
learning behavior.
These goals are achieved with the help of data mining
techniques such as k-nearest neighbor, naive Bayes,
decision trees, artificial neural networks, support vector
machines, k-means, hierarchical clustering etc.12
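As a rough illustration of the first technique in this list, the sketch below classifies a student by majority vote among the k nearest labelled students. The three features (logins, forum posts, quiz average), their values and the class labels are hypothetical and are not taken from the data set used in this study.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort the training examples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest examples and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already normalised features: (logins, forum posts, quiz average).
students = [
    ((0.9, 0.8, 0.9), "excellent"),
    ((0.8, 0.7, 0.95), "excellent"),
    ((0.7, 0.9, 0.85), "excellent"),
    ((0.2, 0.1, 0.4), "other"),
    ((0.3, 0.2, 0.5), "other"),
    ((0.1, 0.3, 0.35), "other"),
]

prediction = knn_predict(students, (0.75, 0.8, 0.9), k=3)
```

Features are assumed to be scaled to a common range beforehand, since Euclidean distance would otherwise be dominated by the attribute with the largest numeric range.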
Still, learning management systems are not primarily
designed with data analysis and mining in mind,
and usage data is not stored in a systematic way. Its
thorough analysis requires long and tedious preprocessing.13 Furthermore, LMSs usually
produce statistical reports. These reports, however, do not
help instructors draw useful conclusions
about either the course potential or student abilities, and are
useful only for platform administration purposes.2
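To give a sense of the preprocessing involved, the sketch below turns a flat activity log into one feature row per student. The log layout (student id, action name, date) and the action names are assumptions made for illustration; they do not reproduce Moodle's actual log schema.

```python
from collections import Counter, defaultdict

# Hypothetical raw log rows: (student_id, action, ISO date).
raw_log = [
    ("s1", "view_lesson", "2011-10-03"),
    ("s1", "forum_post", "2011-10-03"),
    ("s2", "view_lesson", "2011-10-04"),
    ("s1", "view_lesson", "2011-10-05"),
    ("s2", "quiz_attempt", "2011-10-05"),
]

def aggregate_log(rows):
    """Collapse event rows into per-student action counts and active-day counts."""
    action_counts = defaultdict(Counter)
    active_days = defaultdict(set)
    for student, action, day in rows:
        action_counts[student][action] += 1
        active_days[student].add(day)
    # Build one flat feature dictionary per student.
    return {
        student: {**counts, "active_days": len(active_days[student])}
        for student, counts in action_counts.items()
    }

features = aggregate_log(raw_log)
```

Each resulting dictionary can serve as one row of the attribute table that a classification or clustering algorithm takes as input.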
This research shows how one can leverage the available
data on student behavior to predict the success of
students, as well as to profile students into groups, which
may help improve existing learning material and
collaborative learning. The study involves data from
students attending online (distance learning) university
courses, as suggested by Romero et al.,6 and extends the
available data with students’ cognitive styles.
Additionally, we propose a Moodle module that allows
automatic extraction of the data needed for EDM analysis
and deploys the models developed in this study.
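Profiling students into groups can be sketched with a plain k-means loop. This is a minimal illustration and not the clustering procedure used in the study: the two features (a cognitive-style score and an overall grade, both scaled to [0, 1]) and the data points are hypothetical, and the initial centroids are simply the first k points so that the run is deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Cluster `points` (tuples of floats) into k groups; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # deterministic initialisation
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical (cognitive-style score, overall grade) pairs.
profiles = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9), (0.2, 0.1), (0.8, 0.85)]
centroids, labels = kmeans(profiles, k=2)
```

In practice the number of clusters and the initialisation would be chosen more carefully, for example by comparing several runs with a cluster validity measure.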
The paper is structured as follows: Section 2 introduces
related work on using e-learning data and applying data
mining models. The architectural design of the decision-support system is given in Section 3, with experimental
results of using data mining models presented in Section
4. Potential ways of using the knowledge gained by data
mining models are described in Section 5, and Section 6
discusses open issues and related problems for these
types of applications.
2. Background
Romero and Ventura gave a systematic survey of
EDM from 1995 to 2005.10 Because of the increasing
popularity and number of studies in this area, the
same authors gave an extensive overview of the state
of the art until 2011, with over 300
references.12 In this paper we focus on the studies
that are closest to our work. The study by Wang and Liao
investigated how data mining
techniques can be successfully used for adaptive
learning.14 In academic institutions, the Moodle platform is
often utilized as a significant part of e-learning systems.
Romero et al. described how different data mining
techniques can be used in that setting to improve the
course and the students’ learning.6
Applications or tasks that have been resolved through
data mining techniques are classified by Romero and
Ventura in twelve categories: Analysis and visualization
of data, Providing feedback for supporting instructors,
Recommendations for students, Predict students’
performance, Student modeling, Detecting undesirable
student behaviors, Grouping students S (...truncated)