Data Analysis of Unemployment of Chinese Regions based on Machine Learning

Document Type

Conference Proceeding

Publication Date



Reducing unemployment has become a pressing social challenge worldwide. In this article, we use the average wages of private-sector employees in cities, broken down by industry, together with the entropy of these wages as feature variables to analyze the unemployment rates of 30 major provinces and cities in China from 2011 to 2019. We apply K-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Adaptive Boosting (AdaBoost) to classify regions into high and low unemployment. We then run a linear regression on the classified groups to analyze the correlation between average wages and income inequality, and we interpret the classification results through the decision boundary. We find that in Chinese regions with low unemployment rates, higher average wages are often accompanied by greater income inequality, whereas in regions with high unemployment rates this relationship is weaker: the same increase in average wages is associated with a smaller increase in income inequality than in low-unemployment regions.
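The pipeline described in the abstract can be illustrated with a small sketch. The abstract does not give the entropy formula, the classifier settings, or the data, so everything below is an assumption made for illustration: entropy is taken to be the Shannon entropy of each industry's share of the summed wages (under this definition, equal wages across industries give the maximum value), KNN is written out by hand as a stand-in for the three classifiers, and the wage figures and labels are invented. The function names (`wage_entropy`, `features`, `standardize`, `knn_predict`, `ols_slope`) are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def wage_entropy(wages):
    """Shannon entropy (nats) of industry wage shares. Assumed definition."""
    total = sum(wages)
    shares = [w / total for w in wages if w > 0]
    return -sum(p * math.log(p) for p in shares)


def features(wages):
    """Feature pair for one region-year: (mean wage, wage entropy)."""
    return (sum(wages) / len(wages), wage_entropy(wages))


def standardize(rows):
    """Z-score each column so wage levels do not dominate the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds))
            for r in rows]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ols_slope(xs, ys):
    """Slope of the least-squares line y = a + b*x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Invented wages (thousand yuan) for four industries, with invented labels.
data = [
    ([60, 62, 58, 61], "low"),
    ([90, 45, 50, 120], "low"),
    ([40, 42, 41, 39], "high"),
    ([55, 35, 38, 60], "high"),
]
X = standardize([features(w) for w, _ in data])
y = [label for _, label in data]

# Classify a region-year from its standardized (mean wage, entropy) pair.
print(knn_predict(X, y, X[0], k=1))

# Within one unemployment group, regress the entropy feature on mean wage.
low = [features(w) for w, label in data if label == "low"]
print(ols_slope([m for m, _ in low], [e for _, e in low]))
```

Comparing the slope returned by `ols_slope` for the low-unemployment group with the slope for the high-unemployment group mirrors the comparison the abstract draws, although with real data one would more likely use a library such as scikit-learn and validate the classifiers on held-out region-years.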

Publication Title

Proceedings of SPIE - The International Society for Optical Engineering


