E3S Web Conf.
Volume 214, 20202020 International Conference on Energy Big Data and Low-carbon Development Management (EBLDM 2020)
|Number of page(s)||16|
|Section||Big Data Analysis Application and Energy Consumption Research|
|Published online||07 December 2020|
Data-Driven Approach for Predicting and Explaining the Risk of Long-Term Unemployment
Geelong Grammar School, Australia
Long-term unemployment has significant societal impact and is of particular concerns for policymakers with regard to economic growth and public finances. This paper constructs advanced ensemble machine learning models to predict citizens’ risks of becoming long-term unemployed using data collected from European public authorities for employment service. The proposed model achieves 81.2% accuracy on identifying citizens with high risks of long-term unemployment. This paper also examines how to dissect black-box machine learning models by offering explanations at both a local and global level using SHAP, a state-of-the-art model-agnostic approach to explain factors that contribute to long-term unemployment. Lastly, this paper addresses an under-explored question when applying machine learning in the public domain, that is, the inherent bias in model predictions. The results show that popular models such as gradient boosted trees may produce unfair predictions against senior age groups and immigrants. Overall, this paper sheds light on the recent increasing shift for governments to adopt machine learning models to profile and prioritize employment resources to reduce the detrimental effects of long-term unemployment and improve public welfare.
© The Authors, published by EDP Sciences, 2020
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.