Deep learning features encode interpretable morphologies within histological images

Article PDF:

https://www.nature.com/articles/s41598-022-13541-2.pdf

www.nature.com/scientificreports | OPEN

Ali Foroughi pour1, Brian S. White1, Jonghanne Park1, Todd B. Sheridan1,2 & Jeffrey H. Chuang1,3*

Convolutional neural networks (CNNs) are revolutionizing digital pathology by enabling machine learning-based classification of a variety of phenotypes from hematoxylin and eosin (H&E) whole slide images (WSIs), but the interpretation of CNNs remains difficult. Most studies have considered interpretability in a post hoc fashion, e.g. by presenting example regions with strongly predicted class labels. However, such an approach does not explain the biological features that contribute to correct predictions. To address this problem, here we investigate the interpretability of H&E-derived CNN features (the feature weights in the final layer of a transfer-learning-based architecture). While many studies have incorporated CNN features into predictive models, there has been little empirical study of their properties. We show such features can be construed as abstract morphological genes (“mones”) with strong independent associations to biological phenotypes. Many mones are specific to individual cancer types, while others are found in multiple cancers, especially from related tissue types. We also observe that mone-mone correlations are strong and robustly preserved across related cancers. Importantly, linear mone-based classifiers can very accurately separate 38 distinct classes (19 tumor types and their adjacent normals, AUC = 97.1% ± 2.8% for each class prediction), and linear classifiers are also highly effective for universal tumor detection (AUC = 99.2% ± 0.12%). This linearity provides evidence that individual mones or correlated mone clusters may be associated with interpretable histopathological features or other patient characteristics.
In particular, the statistical similarity of mones to gene expression values allows integrative mone analysis via expression-based bioinformatics approaches. We observe strong correlations between individual mones and individual gene expression values, notably mones associated with collagen gene expression in ovarian cancer. Mone-expression comparisons also indicate that immunoglobulin expression can be identified using mones in colon adenocarcinoma and that immune activity can be identified across multiple cancer types, and we verify these findings by expert histopathological review. Our work demonstrates that mones provide a morphological H&E decomposition that can be effectively associated with diverse phenotypes, analogous to the interpretability of transcription via gene expression values. Our work also demonstrates mones can be interpreted without using a classifier as a proxy.

Deep learning has become an important methodology for analyzing biomedical images, and in particular for analyzing hematoxylin and eosin (H&E) stained whole slide images (WSIs). Deep neural networks have achieved classification accuracies higher than classical machine learning models1. However, they are black boxes that do not directly reveal the morphological features they associate with labels, a significant concern for mechanistic analysis and clinical decision making2. Identification of biologically meaningful morphological features may be confounded by image artifacts3, such as blurring, noise, and lossy image compression4. Tissue damage, image quality, and dataset-specific artifacts have also been suggested to affect feature representation and prediction accuracy of neural networks1,5,6. Given the impact of such artifacts on deep learning-based predictors, it is of critical importance to be able to decompose CNNs into features that can be biologically interpreted.
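As a minimal illustration of the mone-expression comparison described in the abstract, the following Python sketch computes a Pearson correlation for every mone-gene pair and reports the strongest one. It is not the authors' pipeline: the data are synthetic, the names `mone_0`, `gene_A`, `pearson`, and `mone_gene_correlations` are hypothetical, and the choice of Pearson correlation is an assumption made for simplicity.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mone_gene_correlations(mones, genes):
    """Return {(mone_name, gene_name): r} for every mone-gene pair.

    Both arguments map a name to a list of per-sample values,
    with samples listed in the same order in every list.
    """
    return {
        (m, g): pearson(m_values, g_values)
        for m, m_values in mones.items()
        for g, g_values in genes.items()
    }


# Synthetic toy data (assumption): 200 samples, 3 mones, 2 genes.
rng = random.Random(0)
n_samples = 200
mones = {
    f"mone_{i}": [rng.gauss(0, 1) for _ in range(n_samples)] for i in range(3)
}
genes = {
    # Hypothetical gene built to depend linearly on mone_1, plus noise.
    "gene_A": [2.0 * v + rng.gauss(0, 0.5) for v in mones["mone_1"]],
    # Hypothetical gene that is independent of every mone.
    "gene_B": [rng.gauss(0, 1) for _ in range(n_samples)],
}

corr = mone_gene_correlations(mones, genes)
best_pair = max(corr, key=lambda pair: abs(corr[pair]))
print(best_pair, round(corr[best_pair], 3))
```

In this toy setting the strongest pair is the one constructed to be linearly related. With real data, a rank-based correlation and a multiple-testing correction would likely be needed, since many mone-gene pairs are tested at once.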
1The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT 06032, USA. 2Department of Pathology, Hartford Hospital, 80 Seymour St, Hartford, CT 06106, USA. 3Department of Genetics and Genome Sciences, UCONN Health, Farmington, CT 06032, USA. *email:
Scientific Reports | (2022) 12:9428 | https://doi.org/10.1038/s41598-022-13541-2

The majority of models for visualizing, analyzing, and interpreting CNNs reveal “where” a network is “looking” to make its prediction, rather than revealing “what” information in the region of interest is important. Some methods output pixel patterns that affect the value of a neuron in a deep network7. However, such techniques tend to output different predictive regions, can be difficult to validate, or have been suggested to be “fragile”, i.e. extremely sensitive to small perturbations of the image8. Optimizing conventional deep learning techniques, such as self-attention, to identify regions informative of class labels is a current theme in digital pathology9,10. While most methods assess deep feature representations as a whole, recent work suggests deep learning features cluster together and encode distinct morphologies11. Other recent works have focused on visualizing individual deep learning features as heatmaps12. Finally, as the majority of interpretation methods have focused on identifying regions predictive of class labels, they require a trained classifier and cannot be directly used in pipelines that employ unsupervised feature learning.

Unlike natural image analysis13, biomedical image analysis is complemented by additional data modalities, such as multiplexed imaging, single cell and bulk sequencing, and clinical information14,15. These data may aid in interpreting the deep feature representations of the H&E slide. However, models integrating these diverse modalities are needed.
The feasibility of doing so is supported by work establishing the connection between modalities, for example by using CNNs to predict expression values of specific genes from H&E images16–18.

Because of the architectural complexity of CNNs, it has often been assumed that CNN-based decompositions of images into features are not interpretable. However, there has been little empirical study of this question, e.g. by testing whether CNN-derived features are correlated with simple biological features such as gene expression values. In this work, we investigate the interpretability of CNN-derived image features. Prior works1,19 have referred to these by various names (e.g. features, fingerprints) whose use is not specific to biological image analysis. For clarity and because they represent morphological features in many ways analogous to genes, we refer to them as mones (i.e. “morphological genes”). We find that mones share statistical similarities with gene expression data, and hence, a mone can be conceptualized as an abstract gene with some expression value. Individual mones have strong linear associations with phenotypic features, making them directly interpretable, which we demonstrate in several analyses. We demonstrate that many mones can distinguish cancer tu (...truncated)