A process-guided uncertainty-aware deep learning framework for reliable and interpretable industrial fault diagnosis
RESEARCH ARTICLE
Babar Hayat1, Shabeer Ahmad2, Muhammad Asfandyar Shahid3, Adil Khan1,
Md. Rajibul Islam4, Md Shohel Sayeed5*, Yasir Ullah6
1 School of Information Engineering, Xi’an Eurasia University, Xi’an, Shaanxi, China, 2 School of
Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing, China, 3 School
of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China,
4 Department of Computer Science and Engineering, Bangladesh University of Business and Technology,
Dhaka, Bangladesh, 5 Centre for Intelligent Cloud Computing, CoE for Advanced Cloud, Faculty of
Information Science and Technology, Multimedia University, Bukit Beruang, Melaka, Malaysia, 6 Centre for
Wireless Technology, Faculty of AI and Engineering, Multimedia University, Cyberjaya, Selangor, Malaysia
OPEN ACCESS
Citation: Hayat B, Ahmad S, Shahid MA, Khan A, Islam MR, Sayeed MS, et al. (2026)
A process-guided uncertainty-aware deep learning framework for reliable and interpretable
industrial fault diagnosis. PLoS One 21(6): e0349385.
https://doi.org/10.1371/journal.pone.0349385
Editor: Muhammad Shahid Anwar, Gachon University, KOREA, REPUBLIC OF
Received: December 22, 2025
Accepted: April 29, 2026
Published: June 2, 2026
Copyright: © 2026 Hayat et al. This is an open access article distributed under the terms of
the Creative Commons Attribution License, which permits unrestricted use, distribution,
and reproduction in any medium, provided the original author and source are credited.
Data availability statement: All relevant data are within the manuscript and its Supporting
information files.
Funding: The author(s) received no specific funding for this work.
Abstract
Timely fault detection is essential for safety, product quality, and energy efficiency
in advanced industrial processes. However, many existing fault diagnosis methods
insufficiently exploit process structure and sensor reliability, which limits their
robustness and practical usefulness for process engineers. This study presents an
improved framework, SAU-PGA-CNN-BiLSTM. First, it couples Convolutional Neural
Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) layers to extract the
multivariate temporal dynamics and spatial correlations of the process data. Second, it
introduces a process-guided, sensor-aware attention mechanism that embeds process
centrality, sequence-level sensor reliability, and uncertainty into attention learning,
suppressing unreliable channels and biasing the model toward informative and stable
sensors. In addition, Monte Carlo dropout with sensor prior-conditioning is used to provide
calibrated confidence estimates that reflect both predictive uncertainty and sensor
reliability. Finally, two lightweight sigmoid output heads perform fault detection and
diagnosis jointly, promoting mutual reinforcement between the tasks. Validated
on the Tennessee Eastman Process benchmark, the proposed framework outperforms the
baseline models and achieves 93.6% multiclass diagnosis accuracy with a
94.0% F1 score. After temperature scaling, the proposed model also demonstrates
improved calibration compared with an otherwise identical model without sensor
awareness, reducing negative log-likelihood from 0.197 to 0.182, Brier score from
0.101 to 0.095, and expected calibration error from 0.040 to 0.037. Attention
visualizations further show that the model focuses on process-relevant and reliable
sensors, supporting reliable industrial fault diagnosis.
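As a rough illustration of the calibration metrics quoted above, the following Python sketch computes negative log-likelihood, Brier score, and expected calibration error for softmax outputs, and selects a temperature by grid search on held-out logits. It is not the authors' implementation: the function names, the 10 equal-width confidence bins, the unnormalized multiclass Brier definition, the candidate temperature grid, and the toy logits are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over one logit vector, with logits divided by `temperature`."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(probs, labels):
    """Mean negative log-likelihood of the true class."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def brier(probs, labels):
    """Mean squared error between probability vectors and one-hot labels.

    Assumption: summed over classes, not divided by the number of classes.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)


def ece(probs, labels, n_bins=10):
    """Expected calibration error with equal-width confidence bins (assumed scheme)."""
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]  # count, sum of confidence, sum of correct
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += 1.0 if pred == y else 0.0
    n = len(labels)
    # Weighted gap |accuracy - confidence| per bin, weights = bin size / n.
    return sum(abs(s_acc - s_conf) / n for cnt, s_conf, s_acc in bins if cnt)


def fit_temperature(logits, labels, candidates=None):
    """Pick the temperature that minimizes NLL on held-out data (simple grid search)."""
    candidates = candidates or [0.5 + 0.1 * i for i in range(46)]  # roughly 0.5 to 5.0
    return min(candidates, key=lambda t: nll([softmax(z, t) for z in logits], labels))


if __name__ == "__main__":
    # Toy validation logits and labels, made up for illustration only.
    logits = [[4.0, 0.0, 0.0], [0.0, 3.0, 0.0], [2.5, 0.0, 0.0], [0.0, 0.0, 3.5]]
    labels = [0, 1, 1, 2]
    t = fit_temperature(logits, labels)
    for name, temp in (("uncalibrated", 1.0), ("scaled", t)):
        probs = [softmax(z, temp) for z in logits]
        print(name, temp, nll(probs, labels), brier(probs, labels), ece(probs, labels))
```

Temperature scaling leaves the predicted class unchanged and only rescales confidence, so accuracy is unaffected while the three metrics can move. In practice the temperature is often fitted with a gradient-based optimizer on a validation split; the grid search here is only a dependency-free stand-in.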
PLOS One | https://doi.org/10.1371/journal.pone.0349385 June 2, 2026
Competing interests: The authors have
declared that no competing interests exist.
1. Introduction
Detecting and diagnosing abnormal events quickly in large industrial processes is
crucial for safety, product quality, and energy efficiency [1]. As modern industrial
processes rely on increasingly dense sensor networks, they generate and store huge
amounts of process data every day, creating an opportunity for intelligent monitoring [2].
However, the complex, nonlinear, and constantly changing nature of industrial systems
makes fault detection and diagnosis (FDD) challenging [3]. Faults that are missed or
misdiagnosed may propagate rapidly, resulting in damaged equipment, environmental risks,
and significant financial loss [4]. Common process monitoring methods that have been used
extensively for FDD in industrial systems include Principal Component Analysis (PCA),
Partial Least Squares (PLS), and Multivariate Statistical Process Control (MSPC) [5–7].
Data-driven techniques, including multivariate statistical process monitoring and machine
learning methods, have therefore gained substantial attention due to their flexibility and
reduced dependence on explicit process models. However, classical methods such as PCA,
PLS, and shallow classifiers typically assume linear relationships and lack the capacity
to capture complex temporal dependencies and fault propagation patterns [8,9]. Moreover,
such methods do not provide practical insight into the underlying causes of failures,
which limits their usefulness for operator intervention and troubleshooting [3]. The
recent surge in popularity of deep learning has produced advanced data-intensive models
for monitoring and diagnosing industrial processes. Convolutional Neural Networks (CNNs)
are well suited to capturing local spatial correlations among sensors, whereas Recurrent
Neural Networks (RNNs), especially those built from Long Short-Term Memory (LSTM) units,
are well suited to capturing temporal dependencies [10,11]. Hybrid models that combine
CNNs with LSTMs or bidirectional LSTMs (BiLSTMs) have shown notable improvements in
accuracy on benchmark datasets such as the Tennessee Eastman Process (TEP). Nevertheless,
more powerful deep learning models are also costly to train and run, which makes them
impractical in industrial settings with tight time, latency, and hardware
constraints [12,13].
More importantly, the vast majority of deep neural networks remain black boxes
that offer little interpretability to process engineers [14]. This lack of transparency
makes it difficult to deploy automated monitoring systems, because operators need more
than alerts: they want to know which variables and time periods are behind the observed
anomalies [15]. Attention mechanisms have been proposed as a promising technique for
interpretability, i.e., for making the internal workings of neural networks more
transparent by assigning importance weights to input elements and
time steps [16]. However, process monitoring models that use attention are often
based on multi-head attention or transformer designs, which add complexity and
computational cost to the models [17]. Beyond predictive accuracy and interpretability, practical industrial fault detection and diagnosis (FDD (...truncated)