Efficient Feature Embeddings for Student Classification with Variational Auto-encoders
- Conference Paper
Gathering labeled data in educational data mining (EDM) is a time- and cost-intensive task. However, the amount of available training data directly influences the quality of predictive models. Unlabeled data, on the other hand, is readily available in high volumes from intelligent tutoring systems and massive open online courses. In this paper, we present a semi-supervised classification pipeline that makes effective use of this unlabeled data to significantly improve model quality. We employ deep variational auto-encoders to learn efficient feature embeddings that improve the performance of standard classifiers by up to 28% compared to completely supervised training. Further, we demonstrate on two independent data sets that our method outperforms previous methods for finding efficient feature embeddings and generalizes better to imbalanced data sets than expert features do. Our method is data-independent and classifier-agnostic, and hence can improve performance on a variety of classification tasks in EDM.
Book title: Proceedings of the 10th International Conference on Educational Data Mining
Subject: Educational data mining; Semi-supervised learning; Deep Neural Networks
Organisational unit: 03420 - Gross, Markus