An Evaluation of the Impact of Hyperparameter Adjustment on the Performance of Traditional Machine Learning Models
Abstract
In machine learning, selecting an appropriate classification model together with a suitable hyperparameter adjustment strategy plays a crucial role in enhancing predictive performance. This paper presents an experimental framework to evaluate the performance of three traditional machine learning models (SVM, Random Forest, and XGBoost) combined with three hyperparameter adjustment strategies: Grid Search, Random Search, and Bayesian optimization. The input data is represented using three types of features: TF-IDF, AST, and their combination. Each model configuration is trained 50 times to ensure statistical reliability. Model performance is evaluated using two key metrics: F1-score and ROC AUC. Experimental results show that the XGBoost model, when using the combined features and optimized via Bayesian optimization, achieves the highest performance, with an F1-score of 92.0% and a ROC AUC of 94.7%, improvements of 2.1% and 1.3%, respectively, over the default settings. A detailed analysis reveals a strong interaction between feature representation, classification algorithm, and adjustment strategy, providing practical guidance for selecting machine learning models in real-world classification tasks.
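To illustrate the kind of comparison the abstract describes, the sketch below contrasts two of the three adjustment strategies (Grid Search and Random Search) on an SVM using scikit-learn. It is a minimal, hypothetical example: the synthetic feature matrix stands in for the TF-IDF/AST representations, and the parameter ranges and data sizes are illustrative assumptions, not the paper's actual configuration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic features standing in for the TF-IDF/AST vectors used in the paper.
X, y = make_classification(n_samples=400, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Grid Search: exhaustive evaluation over a small discrete grid.
grid = GridSearchCV(
    SVC(),
    {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
    scoring="f1", cv=3,
)
grid.fit(X_tr, y_tr)

# Random Search: samples hyperparameters from a continuous distribution,
# covering the same budget with fewer assumptions about grid spacing.
rand = RandomizedSearchCV(
    SVC(),
    {"C": loguniform(1e-2, 1e2)},
    n_iter=10, scoring="f1", cv=3, random_state=0,
)
rand.fit(X_tr, y_tr)

print("grid best:", grid.best_params_, "cv F1: %.3f" % grid.best_score_)
print("random best:", rand.best_params_, "cv F1: %.3f" % rand.best_score_)
```

Bayesian optimization, the third strategy, would replace the sampling step with a surrogate model that proposes promising configurations sequentially (e.g. via a library such as `scikit-optimize`); the evaluation loop around it is the same.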