Learning Algorithm Implementation Structures for Multilabel Classification via CodeBERT
Abstract
Task constraint feedback is the collective name for any kind of feedback system that checks whether problem-defined constraints were satisfied by students upon submission of work. This can be as simple as checking whether certain programming constructs exist, or whether a specific algorithm or data structure required by the problem is implemented. Most of these systems use static analysis (Fischer, 2006; Gotel, 2008) or natural language processing techniques (Lane, 2005) to generate feedback. A transformer is a neural network architecture for processing sequences such as natural language. Previous work has shown that transformers can be generalized to programming language tasks such as code summarization. In this study, we used the CodeBERT transformer to classify, or tag, the algorithms implemented in code snippets in order to check constraint satisfaction. Using a custom dataset of source code written to implement specific algorithms, we show that CodeBERT is capable of learning the structure of how code is implemented regardless of how the programmer names identifiers. Macro-averaging the per-label F1-scores, the model obtained 0.85, a promising result on this dataset.
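As a rough illustration of the setup described above, the sketch below assumes a HuggingFace-style pipeline with the microsoft/codebert-base checkpoint and a multi-label classification head; the label names, decision threshold, and example snippet are hypothetical and are not taken from the paper.

```python
# Minimal multilabel tagging sketch around CodeBERT (assumptions noted above).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical algorithm labels; the paper's actual label set is not listed here.
LABELS = ["bubble_sort", "binary_search", "linear_search", "insertion_sort"]

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # per-label sigmoid + BCE loss
)

def tag_algorithms(code: str, threshold: float = 0.5) -> list[str]:
    """Return the algorithm labels whose predicted probability exceeds the threshold."""
    inputs = tokenizer(code, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)
    return [label for label, p in zip(LABELS, probs) if p >= threshold]

# Example: a snippet that implements bubble sort under a misleading name;
# after fine-tuning, the model should tag it by structure, not by identifiers.
snippet = """
def quick_helper(xs):
    for i in range(len(xs)):
        for j in range(len(xs) - i - 1):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs
"""
print(tag_algorithms(snippet))
```

The per-label sigmoid output is what makes the task multilabel: each algorithm tag is an independent yes/no decision, so a single snippet can carry several tags at once.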
Published
2022-11-28
How to Cite
Learning Algorithm Implementation Structures for Multilabel Classification via CodeBERT. (2022). International Conference on Computers in Education. https://library.apsce.net/index.php/ICCE/article/view/4457