ProtAttn-QuadNet: An attention-based deep learning framework for protein–protein interaction prediction using ProtBERT embeddings
RESEARCH ARTICLE
Md. Shahidul Islam*, Md. Muhtasim Rahman Mim, Md. Raihan Kabir
Department of Computer Science and Engineering, University of Asia Pacific, Dhaka, Bangladesh
Abstract
Protein–protein interactions (PPIs) form the backbone of most cellular processes, governing signal transduction, gene regulation, and metabolic control. However, experimental approaches to identifying PPIs remain expensive, laborious, and often incomplete. Recent advances in protein language models (PLMs) have transformed sequence-based PPI prediction by enabling deep contextual encoding of biochemical and structural information directly from amino acid sequences. Building upon this progress, we present ProtAttn-QuadNet, an attention-based deep learning framework that leverages ProtBERT embeddings to model reciprocal dependencies between protein pairs. The proposed model employs a quad-stream attention mechanism that integrates individual protein features, synergistic interactions, and complementary differences through multi-level self- and cross-attention layers. This architecture enables the discovery of fine-grained relational patterns while ensuring balanced bidirectional modeling of interacting proteins. Evaluated on the independent test set of a large-scale dataset from UniProt, ProtAttn-QuadNet achieves 97.16% accuracy (AUC-ROC 99.00%) on balanced data and 99.19% accuracy (AUC-ROC 99.76%) on oversampled datasets, surpassing several recent state-of-the-art PPI prediction methods. Statistical validation using the Chi-square and Wilcoxon signed-rank tests confirms the model’s predictive significance and reliability. ProtAttn-QuadNet offers a powerful computational framework for large-scale PPI prediction.
OPEN ACCESS
Citation: Islam MS, Mim MMR, Kabir MR (2026) ProtAttn-QuadNet: An attention-based deep learning framework for protein–protein interaction prediction using ProtBERT embeddings. PLoS One 21(6): e0349433. https://doi.org/10.1371/journal.pone.0349433
Editor: Musa Aydin, Samsun University: Samsun Universitesi, TÜRKIYE
Received: November 26, 2025
Accepted: April 30, 2026
Published: June 2, 2026
Peer Review History: PLOS recognizes the benefits of transparency in the peer review process; therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. The editorial history of this article is available here: https://doi.org/10.1371/journal.pone.0349433
Copyright: © 2026 Islam et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data availability statement: The primary data are available from UniProt (https://www.uniprot.org/uniprotkb?query=reviewed:true). All reviewed (Swiss-Prot) entries from UniProtKB were used in this study, comprising 573,661 protein sequences. The processed data and code supporting this study are publicly available on Figshare and can be accessed through the following link: https://doi.org/10.6084/m9.figshare.30637145.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
PLOS One | https://doi.org/10.1371/journal.pone.0349433 June 2, 2026
Introduction
Protein–protein interactions (PPIs) are fundamental to almost all cellular processes, including signal transduction, gene expression regulation, metabolic control, and immune responses [1–3]. Understanding the complex network of PPIs provides valuable insights into cellular functions and disease mechanisms [4,5]. Although numerous experimental techniques, such as yeast two-hybrid screening, co-immunoprecipitation, and affinity purification coupled with mass spectrometry, have been developed to detect PPIs, these methods remain time-consuming, costly, and often limited in coverage [6]. Consequently, computational prediction methods have become indispensable for large-scale PPI analysis.
Early computational approaches primarily relied on handcrafted sequence features, including amino acid composition, evolutionary profiles, and physicochemical
descriptors. Classical machine learning algorithms such as Support Vector Machines
(SVM), Random Forests (RF), and Bayesian classifiers were employed to classify
interacting protein pairs based on these features [7–12]. While these models demonstrated moderate success, their dependence on manually engineered descriptors and
incomplete structural data limited their generalization capabilities, particularly across
species and diverse protein families [11,13–15].
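To make the idea of a handcrafted sequence feature concrete, the following minimal sketch computes amino acid composition for each protein and concatenates the two vectors into a single pair descriptor that a classical classifier could consume. The function names and the choice to concatenate are illustrative assumptions; they are not taken from any of the cited methods.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids, in a fixed order so that feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in the sequence."""
    counts = Counter(sequence.upper())
    # Non-standard symbols (e.g. X, B, Z) are ignored in the denominator.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_features(seq_a: str, seq_b: str) -> List[float]:
    """Concatenate both composition vectors into one 40-dimensional descriptor.

    Concatenation is a hypothetical choice for illustration only.
    """
    return amino_acid_composition(seq_a) + amino_acid_composition(seq_b)
```

Note that a concatenated descriptor depends on the order of the two proteins, so swapping them yields a different input vector for the same pair.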
The increasing availability of large-scale protein sequence databases has encouraged sequence-based prediction methods that rely less on structural information.
Deep learning has substantially advanced this field by enabling hierarchical feature
extraction and representation learning. DeepPPI [16] used a fully connected neural network to model complex non-linear relationships between protein features,
whereas DPPI [17] applied a Siamese-like convolutional architecture to learn symmetric relationships between interacting proteins. Similarly, PIPR [18] introduced a
residual recurrent convolutional neural network (RCNN) to capture both local motifs
and long-range dependencies, while Wu et al. proposed DL-PPI [19], a graph neural
network–based model that integrates multi-scale features and attention mechanisms
to enhance relational reasoning among proteins. These architectures collectively
improved predictive performance but often struggled with interpretability, data imbalance, and computational efficiency.
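The symmetry that Siamese-style models aim for can be illustrated with a toy example: both proteins pass through the same encoder with the same weights, and the two encodings are combined with a commutative operation, so the score cannot depend on input order. This is a simplified sketch under our own assumptions (a single linear layer with tanh and a dot-product combiner); it is not the convolutional architecture used by DPPI.

```python
import math
from typing import List


def encode(features: List[float], weights: List[List[float]]) -> List[float]:
    """Shared encoder: one linear layer followed by tanh (hypothetical toy model)."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]


def symmetric_score(feat_a: List[float], feat_b: List[float],
                    weights: List[List[float]]) -> float:
    """Score a protein pair so that score(a, b) equals score(b, a)."""
    h_a = encode(feat_a, weights)  # same weights for both inputs
    h_b = encode(feat_b, weights)
    # The element-wise product is commutative, so swapping inputs leaves the logit unchanged.
    logit = sum(x * y for x, y in zip(h_a, h_b))
    return 1.0 / (1.0 + math.exp(-logit))  # logistic function maps the logit to (0, 1)
```

Calling `symmetric_score(a, b, W)` and `symmetric_score(b, a, W)` with the same weight matrix `W` returns the same value, which is the property a pair model needs if interaction is treated as an unordered relation.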
Recent advances in transformer architectures and protein language models (PLMs) have reshaped
sequence-based PPI prediction by learning contextualized residue representations
through self-attention mechanisms. Pretrained models such as ProtTrans [5], ProtBERT [20], and ESM-2 [21] encode rich biochemical and evolutionary information
from massive unlabeled protein corpora, effectively capturing secondary and tertiary
structure tendencies directly from primary sequences. Several recent studies have
leveraged these embeddings for PPI prediction using hybrid deep architectures. For
example, xCAPT5 [22] integrated ProtTrans embeddings with a multi-kernel convolutional network to capture local and global dependencies, while TUnA [23] incorporated uncertainty modeling within a transformer framework to improve robustness.
PPI-Graphomer [24] combined pretrained language models with graph transformers
to integrate sequence and structural representations, achieving high performance
across benchmark datasets.
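Protein language models emit one vector per residue, so proteins of different lengths yield matrices of different sizes. One common way to obtain a fixed-length protein representation is to average over residues, sketched below with toy numbers. Whether any of the cited methods, or ProtAttn-QuadNet itself, uses mean pooling is not stated in this section; it is an assumption made purely for illustration, and the tiny vectors stand in for real model output.

```python
from typing import List


def mean_pool(residue_embeddings: List[List[float]]) -> List[float]:
    """Average per-residue vectors (L x D) into a single D-dimensional protein vector."""
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    length = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(vec[d] for vec in residue_embeddings) / length for d in range(dim)]


# Toy stand-in for model output: 2 residues, 3 dimensions each.
toy_embeddings = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
protein_vector = mean_pool(toy_embeddings)  # [2.0, 1.0, 1.0]
```

Averaging discards residue order and position, which is one reason later architectures apply attention over the per-residue vectors rather than pooling them immediately.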
Despite these advances, existing frameworks often treat protein pairs asymmetrically and fail to explicitly model the reciprocal dependencies inhe (...truncated)