Uncertainty-aware ensemble of foundation models differentiates glioblastoma from its mimics

Article
https://doi.org/10.1038/s41467-025-64249-6

Received: 23 April 2025
Accepted: 9 September 2025

Junhan Zhao1,2,18, Shih-Yen Lin1,18, Raphaël Attias1, Liza Mathews1, Christian Engel1, Guillaume Larghero1, Dmytro Vremenko1, Ting-Wan Kao1, Tsung-Hua Lee1, Yu-Hsuan Wang3, Cheng Che Tsai1, Eliana Marostica1, Ying-Chun Lo4, David Meredith5, Keith L. Ligon6, Omar Arnaout7, Thomas Roetzer-Pejrimovsky8, Shih-Chieh Lin9, Natalie NC Shih10, Nipon Chaisuriya4,11, David J. Cook4, Jung-Hsien Chiang3, Chia-Jen Liu1,12,13, Adelheid Woehrer8,14, Jeffrey A. Golden15, MacLean P. Nasrallah10 & Kun-Hsing Yu1,5,16,17

Accurate pathological diagnosis is crucial in guiding personalized treatments for patients with central nervous system cancers. Distinguishing glioblastoma and primary central nervous system lymphoma is particularly challenging due to their overlapping pathology features, despite the distinct treatments required. To address this challenge, we establish the Pathology Image Characterization Tool with Uncertainty-aware Rapid Evaluations (PICTURE) system using 2141 pathology slides collected worldwide. PICTURE employs Bayesian inference, deep ensemble, and normalizing flow to account for the uncertainties in its predictions and training set labels. PICTURE accurately diagnoses glioblastoma and primary central nervous system lymphoma with an area under the receiver operating characteristic curve (AUROC) of 0.989, with the results validated in five independent cohorts (AUROC = 0.924-0.996). In addition, PICTURE identifies samples belonging to 67 types of rare central nervous system cancers that are neither gliomas nor lymphomas. Our approaches provide a generalizable framework for differentiating pathological mimics and enable rapid diagnoses for central nervous system cancer patients.

More than 86,000 patients in the U.S.
are diagnosed with CNS neoplasms annually, leading to over 16,000 deaths each year1. The 2021 WHO Classification of CNS Tumors (WHO CNS5)2 identifies 109 distinct tumor subtypes based on pathology and molecular profiles3. Because treatments and prognoses of different CNS tumors vary considerably4–7, obtaining accurate pathological diagnoses is critical. Glioblastoma, the most common brain cancer in the U.S., has a dismal median survival of 8 months1,5, and surgical resection remains the cornerstone of initial treatment7. Notably, previous studies showed that primary central nervous system lymphoma (PCNSL) is the cancer type most frequently misdiagnosed as glioblastoma8–12. This misclassification has important clinical implications: patients with PCNSL have a median survival of more than three years following diagnosis and often respond well to radiotherapy4,5. Although patients’ age, immune status, and imaging features from magnetic resonance imaging influence clinicians’ initial diagnostic assessments, pathology evaluation using tumor samples provides the final diagnosis6,7. When PCNSL is diagnosed during surgery intended for tumor removal, neurosurgeons will usually discontinue further surgical intervention to preserve neurological function and refer patients for radiotherapy combined with chemotherapy7–9. In addition, final diagnosis using formalin-fixed, paraffin-embedded (FFPE) tissue confirms tumor types and guides long-term treatment planning2. Accurate distinction between glioblastoma and PCNSL at both intraoperative and final diagnostic stages is therefore essential to avoid unnecessary surgery and ensure timely initiation of appropriate therapy.

A full list of affiliations appears at the end of the paper. Nature Communications | (2025)16:8341

Several challenges have hindered the accurate pathological diagnosis of CNS neoplasms1,6.
A central difficulty in diagnosing glioblastoma and PCNSL lies in the inherent variability and uncertainty of both frozen section and FFPE evaluations13,14. Intraoperative frozen section diagnostics are invaluable for immediate assessment during brain cancer surgeries. However, prior studies have reported that 9.7% to 46.2% of frozen section diagnoses differ from the final FFPE-based diagnoses9–11,15,16. Recent studies have also reported an inter-observer disagreement rate of up to 16% in FFPE diagnoses13,14. The definitive diagnosis of brain cancers relies on FFPE tissue analysis, which enables thorough evaluation of the morphological patterns observed in CNS neoplasms, yet these microscopic findings are sometimes distinct across cancer types and, at other times, overlapping. For example, glioblastoma pathology is highly variable and shares features with other tumors, including PCNSL. Glioblastomas typically manifest as infiltrating hypercellular neoplasms with nuclear pleomorphism, microvascular proliferation, and necrosis with or without surrounding pseudopalisading17. The neoplastic cells may be fibrillary, epithelioid, or round cells, the latter mimicking lymphoma cells. Further complicating diagnoses, PCNSL may also exhibit nuclear pleomorphism, necrosis, increased mitotic activity, and a perivascular propensity that can mimic pseudopalisading18. In addition, the atypia of reactive glia in PCNSL and infiltrating lymphoma cells within the brain parenchyma can lead to misinterpretation18,19.

Weakly supervised machine learning applied to pathology images has demonstrated the potential to assist cancer cell detection, subtype classification, and prognostic prediction20. Nevertheless, current deep learning-based approaches for neuro-oncological diagnostics remain largely confined to radiological applications.
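
To make the notion of weak supervision concrete, the following is a minimal sketch of attention-based multiple-instance pooling, a common way to train on slide-level labels when no tile-level annotations exist. It is a generic illustration and is not taken from this paper; the function names, the toy feature vectors, and all weights (`attn_weights`, `clf_weights`, `bias`) are hypothetical values chosen for the example, where a real model would learn them from data.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    tiles: Sequence[Sequence[float]], attn_weights: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Collapse tile feature vectors into one slide-level vector.

    Each tile receives a scalar relevance score (its dot product with the
    hypothetical vector `attn_weights`); the softmax of those scores
    weights the average of the tile features.
    """
    scores = [sum(w * x for w, x in zip(attn_weights, tile)) for tile in tiles]
    alphas = softmax(scores)
    dim = len(tiles[0])
    pooled = [sum(a * tile[d] for a, tile in zip(alphas, tiles)) for d in range(dim)]
    return pooled, alphas


def slide_probability(
    pooled: Sequence[float], clf_weights: Sequence[float], bias: float
) -> float:
    """Logistic slide-level prediction from the pooled feature vector."""
    logit = sum(w * x for w, x in zip(clf_weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy slide with three tiles and 2-D features (assumed values).
# In training, only the slide-level label would be available.
tiles = [[0.1, 0.0], [0.2, 0.1], [3.0, 2.5]]
pooled, alphas = attention_pool(tiles, attn_weights=[1.0, 1.0])
prob = slide_probability(pooled, clf_weights=[0.8, 0.6], bias=-2.0)
print([round(a, 3) for a in alphas], round(prob, 3))
```

In this sketch the attention weights show which tiles drive the slide-level prediction, which is one reason this family of methods is popular for whole-slide images, where a diagnosis is recorded per slide rather than per region.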
Existing pathological diagnostic models focus on differentiating glioma types or on applying few-shot learning techniques to rarer subtypes because of limited data availability. In addition, models trained on cohorts without sufficient diversity often experience substantial performance decay when applied to new patient populations due to differences in sample preparation and slide scanning protocols21. Owing to the morphological heterogeneity of glioblastoma17, previous studies showed substantial variations in AI models’ diagnostic performance for this deadly cancer22. Moreover, standard machine learning models inevitably classify any new data point into one of the categories they were trained with, regardless of the nature of the new sample23. These caveats have limited the application of AI models in cancer diagnoses24.

In this study, we present the Pathology Image Characterization Tool with Uncertainty-aware Rapid Evaluations (PICTURE). PICTURE leverages epistemic uncertainty quantifications25,26 to identify atypical pathology manifestations and uses diverse pathology images presented in medical literature to guide the development of self-supervised deep neural networks. We successfully validate the PICTURE system and show that it significantly outper (...truncated)