Machine learning approaches for predicting student dropout

  • Tran Thanh Nam
  • Nguyen Van Linh
  • Nguyen Anh Duy
  • Ngo Ho Anh Khoi
Keywords: AI, classification, machine learning, predict student dropout dataset

Abstract

Student dropout is a growing concern with far-reaching implications. It affects students and their families directly, and it also poses significant challenges to higher education institutions and to society at large. For students, the consequences include increased difficulties and a deficiency in soft skills and life experience, which often lead them to seek part-time employment. This study develops a machine learning model by analyzing, comparing, and evaluating the performance of five models: AdaBoost, DecisionTree, RandomForest, ExtraTree, and BernoulliNB. All models are implemented using the "Predict Student Dropout Dataset." After the data are processed, the results are analyzed according to two main criteria: evaluation by average percentage, standard deviation, and final outcomes; and evaluation using a time-series model of age (Balanced Accuracy Progression). The model with the best performance under these analyses is then selected. By identifying the underlying causes of dropout and addressing them effectively, the research aims to reduce the burden on families and society, mitigate social problems, stimulate economic growth, generate job opportunities, and enhance both competitiveness and productivity. The dataset is of substantial value to researchers conducting comparative studies on student academic performance and serves as a useful resource for training in the field of machine learning.
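As an illustration of the first evaluation criterion, the sketch below shows one common way to compute balanced accuracy (the mean of per-class recall) and to summarize a set of scores by their average and standard deviation. It is a minimal sketch, not the authors' implementation: the function names `balanced_accuracy` and `summarize`, the class labels, and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict
from statistics import mean, pstdev


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall, so every class counts equally."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return mean(correct[c] / total[c] for c in total)


def summarize(scores):
    """Return (average, population standard deviation) of a list of scores."""
    return mean(scores), pstdev(scores)


# Hypothetical toy labels (not taken from the dataset): the minority class
# "dropout" is recalled half the time, the majority class every time.
y_true = ["dropout", "dropout", "graduate", "graduate", "graduate", "graduate"]
y_pred = ["dropout", "graduate", "graduate", "graduate", "graduate", "graduate"]
score = balanced_accuracy(y_true, y_pred)  # (0.5 + 1.0) / 2

# Hypothetical per-run scores for a single model, for illustration only.
avg, std = summarize([0.70, 0.80, 0.90])
print(f"balanced accuracy = {score:.2f}, mean = {avg:.2f}, std = {std:.3f}")
```

On imbalanced data, balanced accuracy is lower than plain accuracy whenever the minority class is predicted poorly, which is why it is a common choice when one class (such as students who drop out) is much rarer than the others.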

Published
2025-02-10