Towards Final Scores Prediction over Clickstream Using Machine Learning Methods
Abstract
E-books can produce a significant amount of clickstream data that offers insight into students’ learning behavior. Clickstream data are often analyzed in the learning analytics and educational data mining domains to understand students’ synchronous and asynchronous learning processes. The present study analyzed a dataset of university students’ clickstream data to predict their final scores using machine-learning methods. First, the raw data were preprocessed in four steps: data aggregation, feature generation, data balancing, and feature selection. Machine-learning methods were then used to predict the final scores of high-performing and low-performing students. Eight machine-learning methods (Neural Network, AdaBoost, Logistic Regression, Naïve Bayes, kNN, Support Vector Machine, Random Forest, and CN2 Rule Induction) were employed and their performances compared. Results revealed that the CN2 Rule Induction algorithm, with 88% accuracy, outperformed the other machine-learning methods when the five best selected features from the dataset were used. However, the Multilayer Perceptron-based Neural Network performed best, with accuracy similar to that of CN2 Rule Induction, when all features were used for prediction. This paper also shows how SMOTE, a data balancing algorithm, can be applied to solve the data imbalance problem, and how various scoring methods can be compared to identify the most important feature attributes in clickstream data.
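The data balancing step named in the abstract can be illustrated with a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. This is not the authors' implementation. The function names `smote` and `balance`, the two-class assumption, the toy feature values, and the labels `"low"` and `"high"` are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until the two classes are equal in size.

    Assumes a binary problem: every label other than minority_label is
    treated as the majority class.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)
    new_rows = smote(minority, max(n_needed, 0), k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Hypothetical toy data: two made-up features per student, six students
# labelled "low" and two labelled "high".
X = [[10, 2.0], [12, 2.5], [11, 1.8], [9, 2.2], [13, 2.1], [10, 1.9],
     [30, 6.0], [32, 6.5]]
y = ["low"] * 6 + ["high"] * 2

X_bal, y_bal = balance(X, y, minority_label="high")
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the data used for evaluation.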
Published
2018-11-26
Conference Proceedings Volume
Section
Articles
How to Cite
Towards Final Scores Prediction over Clickstream Using Machine Learning Methods. (2018). International Conference on Computers in Education. http://library.apsce.net/index.php/ICCE/article/view/3802