Applying Machine Learning to Assess the Health of the Private Economy in Vietnamese Localities
Abstract
The private economy is a crucial driver of growth, job creation, and innovation; however, Vietnam still lacks a systematic measure of the "health" of this sector at the local level. This article therefore builds a comprehensive index of private economic health for 63 provinces and cities, to support the clustering of localities and guide policy. The method applies machine learning: principal component analysis (PCA) reduces dimensionality and extracts the core dimensions (scale, density, operational efficiency), and K-means then clusters the localities. The data come from the General Statistics Office (2022) and cover both quantity and quality indicators. The results identify three distinct clusters: (i) Hanoi and Ho Chi Minh City, with large scale and relatively comprehensive development; (ii) a group of industrial provinces (Bac Ninh, Bac Giang, Hai Phong, Ba Ria-Vung Tau), with a high density of enterprises and efficient use of capital but overall efficiency below that of cluster (i); and (iii) mountainous and remote provinces, with modest scale and efficiency. The study's contribution is a multidimensional, transparent, and reproducible index based on PCA and K-means, which provides an empirical basis for designing policy recommendations tailored to each cluster of localities.
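As an illustration of the PCA-then-K-means pipeline summarized above, the following is a minimal sketch and not the study's actual implementation. It assumes a matrix of 63 localities by six indicators filled with random placeholder values (not General Statistics Office data), keeps three components to mirror the three dimensions named in the abstract, and uses three clusters to mirror the reported result. The function names and the plain NumPy implementation are illustrative choices; a standard library implementation of PCA and K-means would serve equally well.

```python
import numpy as np


def standardize(X):
    """Z-score each column so indicators on different scales are comparable."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def pca(Z, n_components):
    """PCA via SVD of a centered matrix.

    Returns the component scores and the share of variance each
    retained component explains.
    """
    Zc = Z - Z.mean(axis=0)
    U, S, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        # Keep the old centroid if a cluster happens to be empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(points, centroids), centroids


# Placeholder data: 63 localities x 6 hypothetical indicators (random values).
rng = np.random.default_rng(42)
X = rng.lognormal(size=(63, 6))

scores, ratio = pca(standardize(X), n_components=3)  # assumed: 3 dimensions
labels, centroids = kmeans(scores, k=3)              # assumed: 3 clusters

print("Explained variance share of retained components:", ratio.round(3))
print("Localities per cluster:", np.bincount(labels, minlength=3))
```

Standardizing before PCA matters here because indicators such as enterprise counts and capital-efficiency ratios sit on very different scales; without it, the largest-magnitude indicator would dominate the first component.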