Towards Final Scores Prediction over Clickstream Using Machine Learning Methods

Authors

  • Mohammad Nehal HASNINE, Academic Center for Computing and Media Studies, Kyoto University, Japan
  • Gokhan AKCAPINAR, Department of Computer Education & Instructional Technology, Hacettepe University, Turkey
  • Brendan FLANAGAN, Academic Center for Computing and Media Studies, Kyoto University, Japan
  • Rwitajit MAJUMDAR, Academic Center for Computing and Media Studies, Kyoto University, Japan
  • Kousuke MOURI, Institute of Engineering, Tokyo University of Agriculture and Technology, Japan
  • Hiroaki OGATA, Academic Center for Computing and Media Studies, Kyoto University, Japan

Abstract

E-books can produce a significant amount of clickstream data that offers insight into students’ learning behavior. Clickstream data are often analyzed in the learning analytics and educational data mining domains to understand students’ synchronous and asynchronous learning processes. The present study analyzed a dataset of university students’ clickstream data to predict their final scores using machine-learning methods. First, the raw data are preprocessed in four steps: data aggregation, feature generation, data balancing, and feature selection. Then, using machine-learning methods, the final scores of high-performing and low-performing students are predicted. For this, eight machine-learning methods (Neural Network, AdaBoost, Logistic Regression, Naïve Bayes, kNN, Support Vector Machine, Random Forest, and CN2 Rule Induction) were employed and their performance was compared. Results revealed that the CN2 Rule Induction algorithm, with 88% accuracy, outperformed the other machine-learning methods when the best five selected features from the dataset were considered. However, the Multilayer Perceptron based Neural Network performed best, with accuracy similar to that of CN2 Rule Induction, when all features were used for prediction. This paper also shows how SMOTE, a data balancing algorithm, can be applied to address the data imbalance problem, and how various scoring methods can be compared to identify the most important feature attributes in clickstream data.
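
The data balancing step named in the abstract can be illustrated with a minimal, pure-Python sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The feature names, the sample values, the class sizes, and the assumption that low performers form the minority class are all hypothetical, and the base sample is drawn at random rather than by iterating over every minority sample.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical per-student features: [pages_read, markers_added].
low_performers = [[3.0, 0.0], [5.0, 1.0], [4.0, 0.0], [6.0, 2.0]]
high_performer_count = 10  # assumed size of the majority class

extra = smote(low_performers, high_performer_count - len(low_performers), k=3)
balanced_low = low_performers + extra
```

After this step both classes have the same number of samples, so a classifier trained on them is less likely to favour the majority class. In practice a maintained implementation would normally be preferred over a hand-written one.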

Published

2018-11-26

How to Cite

Towards Final Scores Prediction over Clickstream Using Machine Learning Methods. (2018). International Conference on Computers in Education. http://library.apsce.net/index.php/ICCE/article/view/3802