Grammatical Error Detection with Limited Training Data: The Case of Chinese
Abstract
In this paper, we describe the UDS submission to the shared task on Grammatical Error Diagnosis for Learning Chinese as a Foreign Language. We designed four experiments (runs) for this task. All of them are variations of a frequency-based approach that uses a journalistic corpus as the standard reference corpus and compares its n-gram frequency lists against those of the training and test corpora provided by the shared task organizers. The assumption behind this approach is that comparing a standard reference corpus with a non-standard study corpus using frequency-based methods brings out the non-standard features present in the study corpus. In the case of this corpus, these features are very likely to be grammatical errors. Our system obtained an F-measure of 60.3% at the error detection level and 25.3% at the error diagnosis level.
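The general idea of comparing n-gram frequencies between a reference corpus and a study corpus can be illustrated with a minimal sketch. This is not the submitted system: the use of character bigrams, the `min_count` threshold, the function names, and the toy sentences are all assumptions made for illustration. The sketch counts character n-grams in a small reference corpus and flags the n-grams of a learner sentence that fall below the threshold in that reference.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_reference_counts(sentences, n):
    """Count character n-grams over the reference corpus."""
    counts = Counter()
    for sentence in sentences:
        counts.update(char_ngrams(sentence, n))
    return counts


def flag_unattested(sentence, reference_counts, n, min_count=1):
    """Return (position, n-gram) pairs seen fewer than min_count times
    in the reference corpus. min_count is a hypothetical threshold."""
    return [
        (i, gram)
        for i, gram in enumerate(char_ngrams(sentence, n))
        if reference_counts[gram] < min_count
    ]


# Toy data invented for illustration; a real reference corpus would be
# a large collection of journalistic text.
reference = ["我今天去学校", "他今天去公司", "我明天去学校"]
counts = build_reference_counts(reference, n=2)

# A learner sentence with a word-order problem.
flagged = flag_unattested("我去今天学校", counts, n=2)
for position, gram in flagged:
    print(position, gram)
```

On a corpus this small almost any unseen bigram is flagged, so a realistic setup would need a much larger reference corpus and a threshold tuned on the training data. Mapping flagged positions to specific error types (the diagnosis level) would be a separate step that this sketch does not attempt.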
Published
2014-11-30
Conference Proceedings Volume
Section
Articles
How to Cite
Grammatical Error Detection with Limited Training Data: The Case of Chinese. (2014). International Conference on Computers in Education. http://library.apsce.net/index.php/ICCE/article/view/3072