An Evaluation of the Impact of Hyperparameter Adjustment on the Performance of Traditional Machine Learning Models
Abstract
In machine learning, selecting an appropriate classification model together with a suitable hyperparameter adjustment strategy plays a crucial role in enhancing predictive performance. This paper presents an experimental framework to evaluate the performance of three traditional machine learning models (SVM, Random Forest, and XGBoost) combined with three hyperparameter adjustment strategies: Grid Search, Random Search, and Bayesian optimization. The input data is represented using three types of features: TF-IDF, AST, and their combination. Each model configuration is trained 50 times to ensure statistical reliability. Model performance is evaluated using two key metrics: F1-score and ROC AUC. Experimental results show that the XGBoost model, when using the combined features and optimized via Bayesian optimization, achieves the highest performance, with an F1-score of 92.0% and a ROC AUC of 94.7%, improvements of 2.1% and 1.3%, respectively, over the default settings. A detailed analysis reveals a strong interaction between feature representation, classification algorithm, and adjustment strategy, providing practical guidance for selecting machine learning models in real-world classification tasks.
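To illustrate the kind of comparison the abstract describes, the sketch below contrasts two of the three adjustment strategies (Grid Search and Random Search) on an SVM using scikit-learn. It is a minimal, hypothetical example: the synthetic feature matrix stands in for the TF-IDF/AST representations, and the parameter ranges and data sizes are illustrative assumptions, not the paper's actual configuration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic features standing in for the TF-IDF/AST vectors used in the paper.
X, y = make_classification(n_samples=400, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Grid Search: exhaustive evaluation over a small discrete grid.
grid = GridSearchCV(
    SVC(),
    {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
    scoring="f1", cv=3,
)
grid.fit(X_tr, y_tr)

# Random Search: samples hyperparameters from a continuous distribution,
# covering the same budget with fewer assumptions about grid spacing.
rand = RandomizedSearchCV(
    SVC(),
    {"C": loguniform(1e-2, 1e2)},
    n_iter=10, scoring="f1", cv=3, random_state=0,
)
rand.fit(X_tr, y_tr)

print("grid best:", grid.best_params_, "cv F1: %.3f" % grid.best_score_)
print("random best:", rand.best_params_, "cv F1: %.3f" % rand.best_score_)
```

Bayesian optimization, the third strategy, would replace the sampling step with a surrogate model that proposes promising configurations sequentially (e.g. via a library such as `scikit-optimize`); the evaluation loop around it is the same.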