Trustworthy Secondary Use of Educational Big Data with ReLEAF

Authors

Abstract

Digital infrastructures such as the Learning and Evidence Analytics Framework (LEAF) involve educational big data, whose secondary use offers unprecedented opportunities for advancing research and its impact on education. However, prior efforts have struggled to balance privacy protection with societal benefits. To address this issue, we present a novel data sharing system, ReLEAF, developed on the LEAF infrastructure. It implements a two-stage data sharing approach: (1) differentially private synthetic data is shared, and (2) real-data validation is performed on demand. Since the latter incurs additional privacy loss and operational costs associated with output checking, we conduct a formative user study to explore to what extent synthetic data alone supports secondary use and when real-data validation becomes necessary. Results suggest that, while synthetic data alone could support exploratory analysis, validation is perceived as necessary to publish findings or apply them in practice, unless the quality of the synthetic data is guaranteed. Implications for system improvement and future directions are discussed.

Published

2026-06-25

Conference Proceedings Volume

Section

Conference Proceedings Submissions