A foundation model for clinical-grade computational pathology and rare cancers detection

Nature Medicine | Article
https://doi.org/10.1038/s41591-024-03141-0

Received: 6 February 2024
Accepted: 19 June 2024
Published online: xx xx xxxx

Eugene Vorontsov1,6, Alican Bozkurt1,6, Adam Casson1,6, George Shaikovski1,6, Michal Zelechowski1,6, Kristen Severson2,6, Eric Zimmermann2, James Hall2, Neil Tenenholtz2, Nicolo Fusi2, Ellen Yang3, Philippe Mathieu1, Alexander van Eck1, Donghun Lee1, Julian Viret1, Eric Robert1, Yi Kan Wang1, Jeremy D. Kunz1, Matthew C. H. Lee1, Jan H. Bernhard1, Ran A. Godrich1, Gerard Oakley1, Ewan Millar4, Matthew Hanna3, Hannah Wen3, Juan A. Retamero1, William A. Moye1, Razik Yousfi1, Christopher Kanan1,5, David S. Klimstra1, Brandon Rothrock1, Siqi Liu1 & Thomas J. Fuchs1

The analysis of histopathology images with artificial intelligence aims to enable clinical decision support systems and precision medicine. The success of such applications depends on the ability to model the diverse patterns observed in pathology images. To this end, we present Virchow, the largest foundation model for computational pathology to date. In addition to the evaluation of biomarker prediction and cell identification, we demonstrate that a large foundation model enables pan-cancer detection, achieving 0.95 specimen-level area under the (receiver operating characteristic) curve across nine common and seven rare cancers. Furthermore, we show that with less training data, the pan-cancer detector built on Virchow can achieve similar performance to tissue-specific clinical-grade models in production and outperform them on some rare variants of cancer. Virchow’s performance gains highlight the value of a foundation model and open possibilities for many high-impact applications with limited amounts of labeled training data.

Pathologic analysis of tissue is essential for the diagnosis and treatment of cancer.
Increasingly, the traditional histological preparations used for light microscopy examination are being replaced by their digital counterparts, also known as whole-slide images (WSIs), which enables the use of computational pathology1–4 to move from primarily academic proof points to routine tools in clinical practice. Computational pathology applies artificial intelligence (AI) to digitized WSIs to support the diagnosis, characterization and understanding of disease5,6. Initial work has focused on clinical decision support tools to enhance current workflows7–14, and in 2021 the first Food and Drug Administration-approved AI pathology system was launched10. However, given the incredible gains in performance of computer vision, a subfield of AI focused on images, more recent studies15–19 attempt to unlock new insights from routine WSIs and reveal undiscovered outcomes such as prognosis and therapeutic response20. If successful, such efforts would enhance the utility of hematoxylin and eosin (H&E)-stained WSIs and reduce reliance on specialized and often expensive immunohistochemistry (IHC) or genomic testing21.

1Paige, New York, NY, US.
2Microsoft Research, Cambridge, MA, US.
3Memorial Sloan Kettering Cancer Center, New York, NY, US.
4NSW Health Pathology, St George Hospital, Sydney, New South Wales, Australia.
5University of Rochester, Rochester, NY, US.
6These authors contributed equally: Eugene Vorontsov, Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, Kristen Severson.

Fig. 1 panel text:

a, Training data
  119,629 patients: unique individuals represented in the data
  208,815 cases: patient events requiring tissue samples
  392,268 specimens: tissue samples
  1,207,837 blocks: paraffin-embedded samples sliced for microscopy
  1,488,550 H&E slides: diagnostic sample, tens of thousands of square pixels after digitization

b, Cancer status
  Cancer 38.0%; Precursor 8.0%; Benign 24.6%; Unknown 29.4%
  Annotations: "≥2.2% neoplasm"; "Non-neoplasm ≥7.3%"

c, Surgery
  Biopsy 63%; Resection 37%

d, Tissue type
  Breast 24.9%; Skin 18.4%; Lymph node 16.6%; Lung 6.1%; Bladder 5.5%; Prostate 3.7%; Stomach 3.5%; Endometrium 3.4%; Colon 3.2%; Ovary 3.2%; Liver 3.2%; Bone 2.7%; Upper GI 2.2%; Pancreas 1.8%; Brain 0.8%; Peritoneum 0.4%; Adrenal gland 0.2%

e, Foundation model: trained to embed tissue tiles in a basic representation that can be adapted to diverse tasks
  H&E slide
  Tissue tiles: 224 × 224 pixel crops from tissue regions in the slide
  Virchow: foundation model with ViT-H architecture (632 million parameters) trained using DINOv2 framework

f, Adaptation: adapt aggregated tile embeddings to predict slide-level attributes across diverse tasks
  H&E slide; Tissue tiles; Virchow; Embeddings; Aggregator
  Pan-cancer detection: tissue-agnostic cancer detection
  Pan-cancer subtyping: tissue-agnostic cancer subtyping
  Digital biomarker prediction: novel and replicative biomarker prediction

Fig. 1 | Overview of the study. The training dataset, training algorithm and application of Virchow, a foundation model for computational pathology. a, The training data can be described in terms of patients, cases, specimens, blocks or slides, as shown. b–d, The slide distribution as a function of cancer status (b), surgery (c) and tissue type (d). e, The dataflow during training requires processing the slide into tiles, which are then cropped into global and local views. f, Schematic of applications of the foundation model using an aggregator model to predict attributes at the slide level. GI, gastrointestinal.
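
The dataflow in Fig. 1e,f (slide, 224 × 224 tissue tiles, tile embeddings, slide-level aggregation) can be illustrated with a minimal Python sketch. Only the tile size comes from the figure. The function names, the tissue-fraction threshold, the random-projection encoder and the mean-pooling aggregator are all assumptions made for illustration; in particular, `embed_tiles` is a hypothetical stand-in and not the Virchow model, and the aggregator used in the study may differ.

```python
import numpy as np

TILE = 224  # tile side length in pixels, as stated in Fig. 1e


def extract_tissue_tiles(slide, tissue_mask, tile=TILE, min_tissue=0.5):
    """Cut a slide array (H, W, C) into non-overlapping tile x tile crops and
    keep those whose tissue fraction is at least min_tissue (assumed rule)."""
    height, width = tissue_mask.shape
    tiles = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            window = tissue_mask[top:top + tile, left:left + tile]
            if window.mean() >= min_tissue:
                tiles.append(slide[top:top + tile, left:left + tile])
    if not tiles:
        return np.empty((0, tile, tile, slide.shape[2]), dtype=slide.dtype)
    return np.stack(tiles)


def embed_tiles(tiles, dim=8, seed=0):
    """Hypothetical stand-in for the tile encoder: a fixed random projection
    of each tile's mean color. It is NOT the Virchow model."""
    rng = np.random.default_rng(seed)
    projection = rng.standard_normal((tiles.shape[-1], dim))
    mean_color = tiles.mean(axis=(1, 2))  # shape: (n_tiles, channels)
    return mean_color @ projection        # shape: (n_tiles, dim)


def aggregate(embeddings):
    """Mean-pool tile embeddings into one slide-level vector (assumed
    aggregator, chosen only because it is the simplest option)."""
    return embeddings.mean(axis=0)


# Synthetic 448 x 672 "slide": the left 448 columns are tissue, the rest is
# white background that the mask excludes.
slide = np.full((448, 672, 3), 255, dtype=np.uint8)
mask = np.zeros((448, 672), dtype=bool)
slide[:, :448] = (200, 120, 160)
mask[:, :448] = True

tiles = extract_tissue_tiles(slide, mask)
slide_vector = aggregate(embed_tiles(tiles))
print(tiles.shape, slide_vector.shape)
```

In this toy setup, four of the six grid positions fall on tissue, so four tiles are embedded and pooled into a single slide-level vector. A downstream classifier for a task such as pan-cancer detection would take that vector as input.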
A major factor in the performance gains of computer vision models has been the creation of large-scale deep neural networks, termed foundation models. Foundation models are trained on enormous datasets (orders of magnitude greater than any used historically for computational pathology) using a family of algorithms, referred to as self-supervised learning (for example, refs. 22–26), which do not require curated labels. Foundation models generate data representations, called embeddings, that can generalize well to diverse predictive tasks27. This offers a distinct advantage over current diagnostic-specific methods in computational pathology, which, limited to a subset of pathology images, are less likely to reflect the full spectrum of variations in tissue morphology and laboratory preparations necessary for adequate generalization in practice. The value of generalization from large datasets is even greater for applications with inadequate quantities of data to develop bespoke models, as is the case for the detection of uncommon or rare tumor types, as well as for less common diagnostic tasks such as the prediction of specific genomic alterations, clinical outcomes and therapeutic response. A successful pathology foundation model should capture a broad spectrum of patterns, including cellular morphology, tissue architecture, staining characteristics, nuclear morphology, mitotic figures, necros (...truncated)