Applying clustering methods to classify customers based on shopping behavior
Abstract
Customer segmentation is central to optimizing marketing strategies. This study applies three common clustering algorithms, K-Means, Hierarchical Clustering, and Gaussian Mixture Models (GMM), to group customers by shopping behavior and demographics (age, gender, total spending), and compares their effectiveness. Using three retail datasets (two from Kaggle, one from Sling Academy), the study preprocesses the data, applies each algorithm, and evaluates the resulting clusters with the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index. The results indicate that GMM is the most effective for segmentation by total spending and gender, producing distinct clusters. Hierarchical Clustering proves suitable for detailed age-based analysis on specific datasets, while K-Means offers a balanced solution that is particularly effective when the cluster structure is clear or when rapid results are needed. The study recommends selecting an algorithm according to the specific business objectives and data characteristics, so that businesses can develop more effective personalized marketing strategies.
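The workflow summarized above (preprocess, cluster with three algorithms, score with three indices) can be sketched as follows. This is a minimal illustration using scikit-learn, not the study's actual code: the data are synthetic stand-ins for the retail datasets, and the number of clusters `K`, the feature encoding, the random seeds, and the helper names `cluster_labels` and `evaluate` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a retail dataset (NOT the study's data):
# columns are age, gender encoded as 0/1, and total spending.
rng = np.random.default_rng(0)
n = 300
age = rng.integers(18, 70, size=n)
gender = rng.integers(0, 2, size=n)
spending = rng.gamma(shape=2.0, scale=500.0, size=n)

# Preprocessing: put all features on a comparable scale.
X = StandardScaler().fit_transform(np.column_stack([age, gender, spending]))

K = 3  # assumed number of clusters, chosen only for illustration


def cluster_labels(X, k):
    """Fit the three algorithms and return one label array per algorithm."""
    return {
        "K-Means": KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X),
        "Hierarchical": AgglomerativeClustering(n_clusters=k).fit_predict(X),
        "GMM": GaussianMixture(n_components=k, random_state=0).fit_predict(X),
    }


def evaluate(X, labels):
    """Compute the three internal validity indices for one clustering."""
    return {
        "silhouette": silhouette_score(X, labels),
        "davies_bouldin": davies_bouldin_score(X, labels),
        "calinski_harabasz": calinski_harabasz_score(X, labels),
    }


results = {name: evaluate(X, labels) for name, labels in cluster_labels(X, K).items()}
for name, scores in results.items():
    print(name, {metric: round(float(value), 3) for metric, value in scores.items()})
```

When reading the output, higher Silhouette and Calinski-Harabasz values and lower Davies-Bouldin values indicate more compact, better-separated clusters. Scores on synthetic data like this say nothing about the study's findings; the sketch only shows how such a comparison could be set up.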