Causal integration in graph neural networks toward enhanced classification: benchmarking and advancements for robust performance


World Wide Web (2025) 28:30
https://doi.org/10.1007/s11280-025-01343-1

Simi Job (1) · Xiaohui Tao (1) · Taotao Cai (1) · Lin Li (2) · Quan Z. Sheng (3) · Haoran Xie (4) · Jianming Yong (5)

Received: 3 October 2024 / Revised: 3 February 2025 / Accepted: 27 March 2025
© The Author(s) 2025

Abstract
The expansion of Graph Neural Networks (GNNs) has highlighted the importance of evaluating their performance in real-world scenarios. However, existing evaluation frameworks often overlook causality, a component that is essential for more robust evaluation of GNNs. To address this gap, we present a benchmark study that systematically compares standard and causal GNN models with a focus on classification tasks. Our analysis covers nine carefully selected GNN models across seven diverse datasets that span three distinct domains. The results reveal the following: I) Causality-enhanced GNNs consistently outperform their traditional counterparts in graph classification tasks; II) Models integrating causal features exhibit greater generalizability across varied datasets; and III) Incorporation of causal elements significantly improves the predictive accuracy of GNNs. These findings highlight the importance of embedding causality in the evaluation and development of GNNs for improved performance and application.
Keywords Graph neural networks · GCN · GAT · Causality · Graph classification · GraphSAGE

Corresponding author: Simi Job
(1) School of Mathematics, Physics, and Computing, University of Southern Queensland, Toowoomba, Australia
(2) School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China
(3) School of Computing, Macquarie University, Sydney, Australia
(4) School of Data Science, Lingnan University, Hong Kong Special Administrative Region, Hong Kong, China
(5) School of Business, University of Southern Queensland, Springfield, Australia

1 Introduction

Graph Neural Networks (GNNs) have emerged as a powerful tool for processing graph-structured data, demonstrating remarkable performance in tasks such as node classification [1], link prediction [2] and graph classification [1–3]. GNNs have found applications in domains including recommendation [4], urban intelligence [5], medicine [6], community detection [7] and fraud detection [8]. Despite their success, GNNs face several limitations, including over-smoothing, limited interpretability and generalizability, sensitivity to graph structure and a limited ability to capture long-range dependencies.

Causality, the understanding of cause-and-effect relationships among factors, extends beyond mere correlation. It focuses on the interactions between elements that produce specific outcomes and on how a change in one element of a system affects another. Recently, there has been an increased focus on exploring causality, with researchers acknowledging the importance of incorporating causal knowledge into data modelling. Causality has found applications in several domains, including economics [9], social sciences [10], medicine [11], healthcare [12], environmental science [13] and recommendation [14, 15].
For instance, in medicine, causal analysis can identify factors that affect treatment outcomes and factors that increase the risk of medical conditions. In social sciences, it can reveal the causal factors that contribute to economic inequalities. In recommendation systems, it can uncover factors that influence user preferences and engagement. In each of these contexts, causal analysis can uncover the factors that drive outcomes, improve model interpretability and enhance predictive accuracy.

Integrating causality into GNN architectures can significantly mitigate the GNN limitations noted above by prioritizing relevant information, capturing long-range dependencies and promoting the extraction of transferable features, thereby improving generalizability. By examining the causal relationships inherent in the data, it becomes possible to enhance GNN performance and application. In this study, we aim to thoroughly investigate the application of graph neural networks to classification tasks and to demonstrate the significance of causally enabled GNNs in identifying true interactions within data.

Few studies have systematically benchmarked GNNs with a focus on causal classification. Existing benchmark studies such as [16], which examined graph positional encoding in GNNs, and [17], which explored the use of GNNs for fault diagnosis, have provided foundational insights into these areas. Kosan et al. [18] conducted a benchmark study focused on GNN explainers, while [19] performed an extensive investigation into deep GNN architectures, experimenting with different model settings across various citation network datasets. All of these studies evaluate GNN performance primarily on traditional metrics, without integrating causal analysis. To address this gap, our research aims to analyse the significance of causality in generalizable graph prediction models.
Specifically, we conduct a comprehensive study of the most representative models used in GNN classification tasks and examine the potential for incorporating causal elements into their respective frameworks. This empirical study contributes the following findings to the research community:

• The attention-based causal model (CAL framework) consistently outperformed baseline GNN models in larger graph classification tasks, demonstrating its ability to capture complex global patterns and dependencies across networks.
• Baseline GNN models excelled in smaller node classification tasks, highlighting their efficiency in scenarios with limited data and simpler relationships.
• Hyperparameter tuning plays a crucial role in improving model performance, and our research emphasizes the adaptability of causal models to multi-class datasets in graph classification.

These findings highlight the importance of embedding causality in the evaluation and development of GNNs for enhanced performance and application.

The remainder of the paper is structured as follows: Section 2 provides an overview of research on GNNs and causality. Section 3 outlines the study design. Section 4 presents the results of the empirical study. Finally, Section 5 concludes the paper with a brief summary of our findings.

2 Where GNN meets causality

This section reviews existing literature on graph neural networks (GNNs) and their variants, causality and their applications in classification tasks.

2.1 Graph neural networks

Graph Neural Networks (GNNs) are designed to process graph-structured data, wher (...truncated)