Educational Data Clustering in a Weighted Feature Space Using Kernel K-Means and Transfer Learning Algorithms

Vo Thi Ngoc Chau; Nguyen Hua Phung

doi:10.25073/2588-1086/vnucsce.172

Published Mar 11, 2018

How to Cite

CHAU, Vo Thi Ngoc; PHUNG, Nguyen Hua. Educational Data Clustering in a Weighted Feature Space Using Kernel K-Means and Transfer Learning Algorithms. VNU Journal of Science: Computer Science and Communication Engineering, [S.l.], v. 33, n. 2, mar. 2018. ISSN 2588-1086. Available at: <//jcsce.vnu.edu.vn/index.php/jcsce/article/view/172>. Date accessed: 06 july 2025. doi: https://doi.org/10.25073/2588-1086/vnucsce.172.

Issue

Vol 33 No 2 (2017)

Section

Articles

Abstract

Clustering the students' data collected within an educational program can reveal groups of students who share similar characteristics in their behavior and study performance. For some programs, however, it is not trivial to prepare enough data for the clustering task. A data shortage can reduce the effectiveness of the clustering process, so that the true clusters are not discovered. Other programs, by contrast, have been well examined and have much larger data sets available for the task. This raises the question of whether the larger data sets from such source programs can be exploited to enhance the clustering task on the smaller data sets from the target program. Using transfer learning techniques, our paper defines a transfer-learning-based clustering method, built on the kernel k-means and spectral feature alignment algorithms, as a solution to the educational data clustering task in this context. Moreover, our method is optimized within a weighted feature space, so that the contribution of the larger source data sets to the clustering process is determined automatically. This ability is the novelty of our proposed solution compared with those in existing works. Experimental results on several real data sets show that our method consistently outperforms other methods based on a variety of approaches, under both external and internal validation.
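
As a rough illustration of two ingredients named above, kernel k-means and a weighted feature space, the following minimal Python sketch may help. It is not the method of the paper: it omits spectral feature alignment and any transfer from source data, and the feature weights are fixed by hand rather than learned. The function names, the RBF kernel choice, and the parameters `gamma` and `init_labels` are assumptions made only for this example.

```python
import numpy as np

def weighted_rbf_kernel(X, weights, gamma=1.0):
    """RBF kernel on features scaled by per-feature weights (assumed form):
    K[i, j] = exp(-gamma * sum_d weights[d] * (X[i, d] - X[j, d])**2)."""
    Xw = X * np.sqrt(weights)               # scaling by sqrt(w) weights squared distances by w
    sq = np.sum(Xw ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Xw @ Xw.T
    return np.exp(-gamma * np.maximum(d2, 0.0))   # clip tiny negative rounding errors

def kernel_kmeans(K, n_clusters, n_iter=50, init_labels=None, seed=0):
    """Plain kernel k-means on a precomputed kernel matrix K."""
    n = K.shape[0]
    if init_labels is None:
        labels = np.random.default_rng(seed).integers(n_clusters, size=n)
    else:
        labels = np.asarray(init_labels).copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)   # empty clusters stay at infinity
        for c in range(n_clusters):
            mask = labels == c
            size = mask.sum()
            if size == 0:
                continue
            # Squared feature-space distance from every point to the mean of cluster c
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / size
                          + K[np.ix_(mask, mask)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Hypothetical toy data: two well-separated groups in two features.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
K = weighted_rbf_kernel(X, weights=np.array([1.0, 1.0]), gamma=0.5)
labels = kernel_kmeans(K, n_clusters=2, init_labels=[0, 0, 0, 1, 1, 1, 1, 0])
print(labels)
```

In this sketch a weight of zero removes a feature from the distance entirely, while a larger weight makes differences in that feature count more. The abstract describes determining such contributions automatically, which this example does not attempt.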
