SliDL: A toolbox for processing whole-slide images in deep learning

PLOS ONE, Aug 2023

The inspection of stained tissue slides by pathologists is essential for the early detection, diagnosis and monitoring of disease. Recently, deep learning methods for the analysis of whole-slide images (WSIs) have shown excellent performance on these tasks, and have the potential to substantially reduce the workload of pathologists. However, WSIs present a number of unique challenges for analysis, requiring special consideration of image annotations, slide and image artefacts, and evaluation of WSI-trained model performance. Here we introduce SliDL, a Python library for performing pre- and post-processing of WSIs. SliDL makes WSI data handling easy, allowing users to perform essential processing tasks in a few simple lines of code, bridging the gap between standard image analysis and WSI analysis. We introduce each of the main functionalities within SliDL: from annotation and tile extraction to tissue detection and model evaluation. We also provide ‘code snippets’ to guide the user in running SliDL. SliDL has been designed to interact with PyTorch, one of the most widely used deep learning libraries, allowing seamless integration into deep learning workflows. By providing a framework in which deep learning methods for WSI analysis can be developed and applied, SliDL aims to increase the accessibility of an important application of deep learning.
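To make the idea of tile extraction concrete, the following is a minimal sketch of how a large image can be divided into a grid of fixed-size tiles. It is not SliDL's API: the function name `tile_grid` and its parameters are hypothetical and use only the Python standard library. It assumes non-overlapping tiles and lets the caller decide whether clipped tiles at the right and bottom edges are kept.

```python
from typing import List, Tuple

# (x, y, width, height) of one tile, in pixel coordinates of the full image.
Box = Tuple[int, int, int, int]


def tile_grid(width: int, height: int, tile_size: int,
              keep_partial: bool = True) -> List[Box]:
    """Return non-overlapping tile boxes covering a width x height image.

    Hypothetical helper for illustration only; not part of SliDL.
    Tiles are listed row by row, starting from the top-left corner.
    """
    if width <= 0 or height <= 0 or tile_size <= 0:
        raise ValueError("width, height and tile_size must be positive")

    boxes: List[Box] = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            # Clip tiles that would run past the right or bottom edge.
            w = min(tile_size, width - x)
            h = min(tile_size, height - y)
            if keep_partial or (w == tile_size and h == tile_size):
                boxes.append((x, y, w, h))
    return boxes


if __name__ == "__main__":
    # A 1000 x 600 pixel region split into 256-pixel tiles.
    for box in tile_grid(1000, 600, 256, keep_partial=False):
        print(box)
```

In a real WSI workflow each box would then be read from the slide file and passed to a model. Discarding partial edge tiles (`keep_partial=False`) is one common simplification, since many networks expect inputs of a fixed size.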


RESEARCH ARTICLE

Adam G. Berman, William R. Orchard, Marcel Gehrung, Florian Markowetz*

Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom

Citation: Berman AG, Orchard WR, Gehrung M, Markowetz F (2023) SliDL: A toolbox for processing whole-slide images in deep learning. PLoS ONE 18(8): e0289499. https://doi.org/10.1371/journal.pone.0289499

Editor: Carlos Fernandez-Lozano, University of A Coruña, SPAIN

Received: March 27, 2023
Accepted: July 20, 2023
Published: August 7, 2023

Copyright: © 2023 Berman et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: The source code of SliDL is freely available at a public repository: https://github.com/markowetzlab/slidl. The source code of a comprehensive SliDL tutorial is also freely available at the following public repository: https://github.com/markowetzlab/slidl-tutorial. That repository also contains the code used to train SliDL’s deep tissue detector in its deep_tissue_detector subdirectory. Complete documentation of SliDL including its application public interface (API) reference is available at the following URL: https://slidl.readthedocs.io/en/latest/. The CAMELYON-16 WSIs and corresponding annotations used in the SliDL tutorial are freely available and can be downloaded by following the instructions at: https://github.com/markowetzlab/slidl-tutorial. The WSIs related to the deep tissue detector can be accessed at https://doi.org/10.5281/zenodo.7947380.

Funding: This research was supported by Cancer Research UK (FM: C14303/A17197, https://www.cancerresearchuk.org/). A.G.B. acknowledges support from a Gates Cambridge Scholarship from the Bill & Melinda Gates Foundation (https://www.gatescambridge.org/). W.R.O. acknowledges support from a Peterhouse Studentship from Peterhouse, Cambridge (https://www.pet.cam.ac.uk/). M.G. acknowledges support from an Enrichment Fellowship from the Alan Turing Institute (https://www.turing.ac.uk/work-turing/studentships/enrichment). F.M. is a Royal Society Wolfson Research Merit Award holder (https://royalsociety.org/grants-schemes-awards/grants/wolfson-research-merit/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: M.G. is an employee and shareholder of Cyted Ltd. F.M. is a co-founder and director of Tailor Bio. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Abbreviations: API, Application Public Interface; BEST2, Barrett’s oEsophagus Screening Trial 2 [48]; OCCAMS, Oesophageal Cancer Clinical and Molecular Stratification [49]; TCGA, The Cancer Genome Atlas [50]; WSI, Whole-Slide Image.

Introduction

In histopathology, tissue biopsies are fixed, embedded, sectioned, stained, and placed on a glass slide before being examined under a microscope. Examination of tissue slides to identify pathologically relevant features has been an essential tool for early detection, diagnosis and disease monitoring in medical practice and research for decades. Pathological features can be anything from the presence or absence of certain cell types or populations, changes in cellular or nuclear morphology, changes in the arrangement of cells in a tissue, to changes in the intensity of certain tissue stains. Until recently only expert pathologists have been able to perform this task, requiring years of training, and with individual slides often having to be evaluated by multiple pathologists before a judgement can be made [1].

However, with a shift towards digitisation in pathology, tissue-slides are now routinely scanned to produce high-resolution whole-slide images (WSIs). Such images are amenable to automated image analysis and in the last decade the field has undergone a revolution. Deep learning methods for image analysis have shown excellent performance on diagnostic tasks [1–3], rivalling that of pathologists and further stimulating efforts to digitise glass slides.

Pathologists have high inter-observer concordance rates on some diagnostic tasks, but in others they frequently disagree [4]. This is compounded by high workload, necessitating rapid screening of individual cases, increasing the risk of introducing diagnostic errors [5]. Deep learning methods are fast, often requiring only a few minutes to evaluate a slide, and give consistent evaluations. Thus, deep learning has the potential to substantially reduce the workload of pathologists, improve the inter-observer concordance rates and accelerate the evaluation of tissue-slides. The application of deep learning to pathological datase (...truncated)


Article home page: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0289499

Adam G. Berman, William R. Orchard, Marcel Gehrung, Florian Markowetz. SliDL: A toolbox for processing whole-slide images in deep learning, PLOS ONE, 2023, Volume 18, Issue 8, DOI: 10.1371/journal.pone.0289499