A foundation model for clinical-grade computational pathology and rare cancers detection
nature medicine
Article
https://doi.org/10.1038/s41591-024-03141-0
Received: 6 February 2024
Accepted: 19 June 2024
Published online: xx xx xxxx
Eugene Vorontsov1,6, Alican Bozkurt1,6, Adam Casson1,6, George Shaikovski1,6,
Michal Zelechowski1,6, Kristen Severson2,6, Eric Zimmermann2, James Hall2,
Neil Tenenholtz2, Nicolo Fusi2, Ellen Yang3, Philippe Mathieu1,
Alexander van Eck1, Donghun Lee1, Julian Viret1, Eric Robert1, Yi Kan Wang1,
Jeremy D. Kunz1, Matthew C. H. Lee1, Jan H. Bernhard1, Ran A. Godrich1,
Gerard Oakley1, Ewan Millar4, Matthew Hanna3, Hannah Wen3,
Juan A. Retamero1, William A. Moye1, Razik Yousfi1, Christopher Kanan1,5,
David S. Klimstra1, Brandon Rothrock1, Siqi Liu1 & Thomas J. Fuchs1
The analysis of histopathology images with artificial intelligence aims
to enable clinical decision support systems and precision medicine. The
success of such applications depends on the ability to model the diverse
patterns observed in pathology images. To this end, we present Virchow,
the largest foundation model for computational pathology to date. In
addition to the evaluation of biomarker prediction and cell identification,
we demonstrate that a large foundation model enables pan-cancer
detection, achieving 0.95 specimen-level area under the (receiver operating
characteristic) curve across nine common and seven rare cancers.
Furthermore, we show that with less training data, the pan-cancer detector
built on Virchow can achieve similar performance to tissue-specific
clinical-grade models in production and outperform them on some rare
variants of cancer. Virchow’s performance gains highlight the value of a
foundation model and open possibilities for many high-impact applications
with limited amounts of labeled training data.
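To make the specimen-level metric quoted above concrete, the following is a minimal sketch of how slide-level scores might be collapsed to specimens and scored with the area under the receiver operating characteristic curve. The max-over-slides rule, the function names and the toy data are assumptions made for illustration; they are not taken from the text above.

```python
def specimen_level_scores(slide_scores):
    """Collapse (specimen_id, score) pairs to one score per specimen.

    Assumption for illustration: a specimen takes the maximum score
    of its slides.
    """
    out = {}
    for specimen_id, score in slide_scores:
        out[specimen_id] = max(score, out.get(specimen_id, float("-inf")))
    return out


def auroc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a toy example.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six slides from four specimens.
slides = [("A", 0.2), ("A", 0.9), ("B", 0.1), ("C", 0.4), ("D", 0.3), ("D", 0.2)]
truth = {"A": 1, "B": 0, "C": 0, "D": 1}

per_specimen = specimen_level_scores(slides)
ids = sorted(per_specimen)
print(auroc([per_specimen[i] for i in ids], [truth[i] for i in ids]))  # 0.75
```

In this toy case specimen D (cancer, score 0.3) ranks below specimen C (benign, score 0.4), so three of the four positive-negative pairs are ordered correctly.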
Pathologic analysis of tissue is essential for the diagnosis and treatment of cancer. Increasingly, the traditional histological preparations
used for light microscopy examination are being replaced by their
digital counterparts, also known as whole-slide images (WSIs), which
enables the use of computational pathology1–4 to move from primarily academic proof points to routine tools in clinical practice. Computational pathology applies artificial intelligence (AI) to digitized
WSIs to support the diagnosis, characterization and understanding of
disease5,6. Initial work has focused on clinical decision support tools
to enhance current workflows7–14, and in 2021 the first Food and Drug
Administration-approved AI pathology system was launched10. However, given the incredible gains in performance of computer vision,
a subfield of AI focused on images, more recent studies15–19 attempt
to unlock new insights from routine WSIs and reveal undiscovered
outcomes such as prognosis and therapeutic response20. If successful, such efforts would enhance the utility of hematoxylin and eosin
(H&E)-stained WSIs and reduce reliance on specialized and often expensive immunohistochemistry (IHC) or genomic testing21.
1Paige, New York, NY, US. 2Microsoft Research, Cambridge, MA, US. 3Memorial Sloan Kettering Cancer Center, New York, NY, US. 4NSW Health Pathology,
St George Hospital, Sydney, New South Wales, Australia. 5University of Rochester, Rochester, NY, US. 6These authors contributed equally: Eugene Vorontsov,
Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, Kristen Severson. e-mail:
[Figure 1 image; text recovered from the panels:
a, Training data: 119,629 patients (unique individuals represented in the data); 208,815 cases (patient events requiring tissue samples); 392,268 specimens (tissue samples); 1,207,837 blocks (paraffin-embedded samples sliced for microscopy); 1,488,550 H&E slides (diagnostic samples, tens of thousands of square pixels after digitization).
b, Cancer status: cancer 38.0%, precursor 8.0%, benign 24.6%, unknown 29.4%. Two bracket labels also appear in the panel: "≥2.2% neoplasm" and "Non-neoplasm ≥7.3%".
c, Surgery: biopsy 63%, resection 37%.
d, Tissue type: lymph node 16.6%, lung 6.1%, bladder 5.5%, prostate 3.7%, skin 18.4%, colon 3.2%, ovary 3.2%, bone 2.7%, peritoneum 0.4%, adrenal gland 0.2%, brain 0.8%, upper GI 2.2%, liver 3.2%, breast 24.9%, endometrium 3.4%, stomach 3.5%, pancreas 1.8%.
e, Foundation model, trained to embed tissue tiles in a basic representation that can be adapted to diverse tasks: H&E slide, then tissue tiles (224 × 224 pixel crops from tissue regions in the slide), then Virchow (foundation model with ViT-H architecture, 632 million parameters, trained using the DINOv2 framework).
f, Adaptation of aggregated tile embeddings to predict slide-level attributes across diverse tasks: H&E slide, tissue tiles, Virchow, embeddings, aggregator; the tasks shown are pan-cancer detection (tissue-agnostic cancer detection), pan-cancer subtyping (tissue-agnostic cancer subtyping) and digital biomarker prediction (novel and replicative biomarker prediction).]
Fig. 1 | Overview of the study. The training dataset, training algorithm and
application of Virchow, a foundation model for computational pathology.
a, The training data can be described in terms of patients, cases, specimens,
blocks or slides, as shown. b–d, The slide distribution as a function of cancer
status (b), surgery (c) and tissue type (d). e, The dataflow during training requires
processing the slide into tiles, which are then cropped into global and local views.
f, Schematic of applications of the foundation model using an aggregator model
to predict attributes at the slide level. GI, gastrointestinal.
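The tile-then-aggregate dataflow described in this caption can be sketched in a few lines. This is an illustrative sketch only: `embed_tiles` is a hypothetical stand-in for the foundation model (a fixed random projection of each tile's mean colour), and mean pooling is swapped in as the simplest possible aggregator; neither is the method used in the study. Tissue detection and the global and local training views are not shown.

```python
import numpy as np

TILE = 224  # tile side in pixels, as stated in Fig. 1


def tile_slide(slide, tile=TILE):
    """Cut an (H, W, C) array into non-overlapping tile x tile crops.

    Ragged right and bottom edges are dropped. Tiles are returned in
    row-major order with shape (n_tiles, tile, tile, C).
    """
    h, w = slide.shape[:2]
    rows, cols = h // tile, w // tile
    trimmed = slide[: rows * tile, : cols * tile]
    grid = trimmed.reshape(rows, tile, cols, tile, -1).swapaxes(1, 2)
    return grid.reshape(rows * cols, tile, tile, -1)


def embed_tiles(tiles, dim=8):
    """Hypothetical stand-in for the foundation model.

    Projects each tile's mean colour through a fixed random matrix,
    only so that the example has embeddings to aggregate.
    """
    rng = np.random.default_rng(0)
    projection = rng.normal(size=(tiles.shape[-1], dim))
    return tiles.mean(axis=(1, 2)) @ projection


def aggregate(embeddings):
    """Mean-pool tile embeddings into a single slide-level vector."""
    return embeddings.mean(axis=0)


# Toy "slide": 500 x 700 pixels, 3 channels, giving a 2 x 3 grid of tiles.
slide = np.arange(500 * 700 * 3, dtype=np.float64).reshape(500, 700, 3)
tiles = tile_slide(slide)
slide_vector = aggregate(embed_tiles(tiles))
print(tiles.shape, slide_vector.shape)  # (6, 224, 224, 3) (8,)
```

A slide-level predictor for any of the tasks in panel f would then be trained on `slide_vector`; in practice a learned aggregator would usually replace the mean so that informative tiles can be weighted more heavily.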
A major factor in the performance gains of computer vision models has been the creation of large-scale deep neural networks, termed
foundation models. Foundation models are trained on enormous
datasets—orders of magnitude greater than any used historically for
computational pathology—using a family of algorithms, referred to as
self-supervised learning (for example, refs. 22–26), which do not require
curated labels. Foundation models generate data representations,
called embeddings, that can generalize well to diverse predictive tasks27.
This offers a distinct advantage over current diagnostic-specific methods in computational pathology, which, limited to a subset of pathology
images, are less likely to reflect the full spectrum of variations in tissue
morphology and laboratory preparations necessary for adequate generalization in practice. The value of generalization from large datasets
is even greater for applications with inadequate quantities of data to
develop bespoke models, as is the case for the detection of uncommon
or rare tumor types, as well as for less common diagnostic tasks such as
the prediction of specific genomic alterations, clinical outcomes and
therapeutic response. A successful pathology foundation model should
capture a broad spectrum of patterns, including cellular morphology, tissue architecture, staining characteristics, nuclear morphology,
mitotic figures, necros (...truncated)