A machine learning model to predict the risk factors causing feelings of burnout and emotional exhaustion amongst nursing staff in South Africa
Van Zyl‑Cillié et al.
BMC Health Services Research
(2024) 24:1665
https://doi.org/10.1186/s12913-024-12184-5
Open Access
RESEARCH
Maria Magdalena Van Zyl‑Cillié1,2*, Jacoba H. Bührmann1, Alwiena J. Blignaut3, Derya Demirtas2 and
Siedine K. Coetzee3
Abstract
Background The demand for quality healthcare is rising worldwide, and nurses in South Africa are under pressure
to provide care with limited resources. This demanding work environment leads to burnout and exhaustion
among nurses. Understanding the specific factors leading to these issues is critical for adequately supporting nurses
and informing policymakers. Currently, little is known about the unique factors associated with burnout and emotional
exhaustion among nurses in South Africa. Furthermore, whether these factors can be predicted using demographic
data alone is unclear. Machine learning has recently been proven to solve complex problems and accurately
predict outcomes in medical settings. In this study, supervised machine learning models were developed to identify
the factors that most strongly predict nurses reporting feelings of burnout and experiencing emotional exhaustion.
Methods The PyCaret 3.3 package was used to develop classification machine learning models on 1165 survey
responses collected from nurses in medical-surgical units across South Africa. The models were evaluated on their
accuracy score, Area Under the Curve (AUC) score and confusion matrix performance. Additionally, the accuracy score
of models using demographic data alone was compared to that of the full survey data models. The features with the
highest predictive power were extracted from both the full survey data and demographic data models for comparison.
Descriptive statistical analysis was used to analyse survey data according to the highest predictive factors.
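The three evaluation measures named in the Methods can be computed directly from a classifier's output. The following is a minimal sketch in plain Python, not the study's PyCaret pipeline: the labels and predicted probabilities are invented for illustration, the 0.5 decision threshold is an assumption, and the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Share of predictions that match the recorded label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = reports feelings of burnout, 0 = does not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]  # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in y_score]     # assumed 0.5 threshold

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(accuracy(y_true, y_pred))          # 0.75
print(roc_auc(y_true, y_score))          # 0.875
```

Accuracy and the confusion matrix depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason the measures are usually reported together.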
Results The gradient boosting classifier (GBC) model had the highest accuracy score for predicting both self-reported
feelings of burnout (75.8%) and emotional exhaustion (76.8%) from full survey data. For demographic data alone,
the accuracy scores were 60.4% and 68.5%, respectively, for predicting self-reported feelings of burnout and emotional
exhaustion. Fatigue was the factor with the highest predictive power for both self-reported feelings of burnout and
emotional exhaustion. Nursing staff’s confidence in management was the second highest predictor for feelings of burnout,
whereas management who listens to employees was the second highest predictor for emotional exhaustion.
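The abstract does not state how predictive power was measured for each feature. One common, model-agnostic way to rank features is permutation importance: shuffle one feature column and record how much the accuracy drops. The sketch below illustrates that idea only and is not the authors' procedure; the toy model, the survey rows and all names are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model predicts the recorded label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)


def permutation_importance(model, rows, labels, feature, repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [row[feature] for row in rows]
        rng.shuffle(column)
        # Copy each row, replacing only the shuffled feature.
        shuffled = [dict(row, **{feature: v}) for row, v in zip(rows, column)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return sum(drops) / repeats


# Hypothetical stand-in for a trained classifier: it looks only at fatigue.
def toy_model(row):
    return 1 if row["fatigue"] >= 6 else 0


# Invented survey rows: fatigue rating (0-10) and age in years.
fatigue = [9, 8, 7, 6, 8, 2, 3, 1, 4, 5]
age = [25, 52, 31, 44, 38, 29, 47, 35, 58, 41]
rows = [{"fatigue": f, "age": a} for f, a in zip(fatigue, age)]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

importance = {name: permutation_importance(toy_model, rows, labels, name)
              for name in ("fatigue", "age")}
ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

Because the toy model ignores age, shuffling that column leaves every prediction unchanged and its importance is exactly zero, while shuffling fatigue breaks the link between feature and label and lowers the accuracy.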
Conclusions Supervised machine learning models can accurately predict self-reported feelings of burnout or emotional
exhaustion among nurses in South Africa from full survey data but not from demographic data alone. The
models identified fatigue rating, confidence in management and management who listens to employees as the most
important factors to address to prevent these issues among nurses in South Africa.
Keywords Supervised machine learning model, Nurse burnout, Emotional exhaustion, Maslach Burnout Inventory
*Correspondence:
Maria Magdalena Van Zyl‑Cillié
Full list of author information is available at the end of the article
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/.
Background
With an increasing demand for healthcare services,
nursing staff are pressured to provide quality care with
limited human and material resources. Dubale et al. [1]
explain that the demanding work environment of nurses
has undesirable consequences such as burnout and physical and emotional exhaustion. In fact, the prevalence of
burnout and other mental morbidities is higher
in the healthcare workforce than among workers in other
settings [2–4]. Burnout is a chronic response to stress in
the workplace, characterized by a physical, mental and
emotional state of exhaustion that reduces the nurse’s
sense of personal and professional fulfillment [5].
The factors that cause burnout amongst nursing staff can be determined through statistical analysis of questionnaire responses. Researchers such as
Grochowska et al. [6] and Kowalski et al. [7] have done
so effectively by analysing nurses’ responses to their own
questionnaires and the standardized Maslach Burnout
Inventory tool. Although their findings provide insight
into the factors that cause burnout amongst nursing staff,
such analysis is often reactive and after the fact. Specifically, their research focuses on detecting burnout and
the factors that lead to burnout from survey responses
rather than predicting factors that cause burnout [7].
Subsequently, Carvalho Manhães Leite and Wooldridge
[8] call for novel methods to predict the factors that lead
to burnout amongst nursing staff instead of detecting the
factors retrospectively.
To this end, Machine Learning (ML) can be used to
predict burnout and the factors with the highest probability of causing burnout. Bzdok and Krzywinski [9]
state that statistical methods draw population inferences
from samples, whereas ML finds generalisable patterns
that can be used for prediction. Recent studies have also
shown that ML algorithms are able to handle complex
data and have outperformed traditional statistical models
in the prediction of several variables [10–13].
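The distinction between inference and prediction drawn above comes down to how a model is judged: a predictive model is fitted on cases with known outcomes and then scored on cases it has not seen. The sketch below shows this with a one-feature threshold rule in plain Python. The data are invented, the rule is a deliberately simple stand-in rather than any model from the cited studies, and the function names are hypothetical.

```python
def fit_threshold(values, labels):
    """Pick the cut-off on one numeric feature that maximises training
    accuracy, predicting 1 when value >= cut-off."""
    best_threshold, best_hits = None, -1
    for threshold in sorted(set(values)):
        hits = sum(1 for v, y in zip(values, labels)
                   if (1 if v >= threshold else 0) == y)
        if hits > best_hits:
            best_threshold, best_hits = threshold, hits
    return best_threshold


def predict(threshold, values):
    """Apply the fitted cut-off to new feature values."""
    return [1 if v >= threshold else 0 for v in values]


# Invented historical data with known outcomes (used for fitting).
train_x = [1, 2, 3, 4, 6, 7, 8, 9]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

# Invented new cases, held out from fitting (used for evaluation only).
test_x = [2, 5, 7, 10]
test_y = [0, 1, 1, 1]

threshold = fit_threshold(train_x, train_y)
test_pred = predict(threshold, test_x)
held_out_accuracy = sum(p == y for p, y in zip(test_pred, test_y)) / len(test_y)

print(threshold)          # 6
print(held_out_accuracy)  # 0.75
```

The rule fits the training data perfectly but misclassifies one held-out case, which is why predictive performance is reported on data that played no part in fitting.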
Specifically, supervised ML is a branch of computer
engineering where computational methods use experience in the form of historical data, with provided
outcomes, to train a computer model to make accurate predictions [14]. It has proven to be an effective
technique to handle complex problems in many fields,
including healthcare. Char et al. [15] agree that incorporating ML into clinical medicine studies holds promise
for substantially improving healthcare delivery. Alzu’bi
et al. [16], for example, presented a model to predict
and reduce absenteeism of nurses. Grzadzielewska [17]
emphasized the effectiveness of utilising ML to predict
burnout and Havaei et al. [18] (...truncated)