Identification of long non-coding RNAs and RNA binding proteins in breast cancer subtypes

Scientific Reports, Jan 2022

Breast cancer (BC) is a heterogeneous disease classified into four main subtypes with different clinical outcomes, such as patient survival, prognosis, and relapse. Current genetic tests for the differential diagnosis of BC subtypes show poor reproducibility. Therefore, early and correct diagnosis of the molecular subtypes remains a challenge in the clinic. In the present study, we identified differentially expressed genes, long non-coding RNAs and RNA binding proteins (RBPs) for each BC subtype from a public dataset by applying bioinformatics algorithms. In addition, we investigated their interactions and proposed interacting biomarkers as potential signatures specific to each BC subtype. We found a network of only 2 RBPs (RBM20 and PCDH20) and 2 genes (HOXB3 and RASSF7) for luminal A, a network of 21 RBPs and 53 genes for luminal B, a HER2-specific network of 14 RBPs and 30 genes, and a network of 54 RBPs and 302 genes for basal BC. We validated the signatures on an independent dataset, using their expression levels to evaluate their ability to classify the different molecular subtypes with a machine learning approach. Overall, we achieved good classification performance, with an accuracy >0.80. In addition, we found some interesting novel prognostic biomarkers, such as RASSF7 for luminal A, DCTPP1 for luminal B, DHRS11, KLC3, NAGS, and TMEM98 for HER2, and ABHD14A and ADSSL1 for basal. These findings could provide preliminary evidence for identifying putative new prognostic biomarkers and therapeutic targets for individual breast cancer subtypes.


Claudia Cava¹*, Alexandros Armaos²,³, Benjamin Lang²,⁴, Gian G. Tartaglia²,³,⁵ & Isabella Castiglioni⁶
¹ Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, 20090 Segrate-Milan, Milan, Italy.
² Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, C/ Dr. Aiguader 88, 08003 Barcelona, Spain.
³ RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano Di Tecnologia (IIT), Via Morego 30, 16163 Genoa, Italy.
⁴ Department of Structural Biology and Center for Data Driven Discovery (C3D), St. Jude Children’s Research Hospital, Memphis, TN 38105, USA.
⁵ Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy.
⁶ Department of Physics “Giuseppe Occhialini”, University of Milan-Bicocca, Piazza dell’Ateneo Nuovo 1, 20126 Milan, Italy.

Scientific Reports | (2022) 12:693 | https://doi.org/10.1038/s41598-021-04664-z

Abbreviations
BC: Breast cancer
lncRNAs: Long non-coding RNAs
RBPs: RNA-binding proteins
TCGA: The Cancer Genome Atlas
NS: Normal samples
DEGs: Differentially expressed genes

Breast cancer (BC) is one of the most common cancers worldwide and is estimated to be the most frequent cancer among women (25% of all new cancers recorded) [1]. The heterogeneity of BC reduces the specificity of the biological features (e.g., histological grade and hormone receptor status) that are usually used for the diagnosis and prognosis of BC and to guide therapy [2,3]. The classification of biological BC subtypes is based on techniques such as immunohistochemistry and gene expression profiling [4].

In 2011, the St. Gallen International Breast Cancer Conference reported a molecular subtype approach to guide BC therapy based on immunohistochemical markers: estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) [4]. In addition to these standard biomarkers, St. Gallen in 2013 included the evaluation of a marker of cell proliferation, Ki-67 [5]. Luminal A is defined by ER positive and/or PR positive and Ki-67 < 14%, and luminal B by ER positive and/or PR positive and Ki-67 ≥ 14%. ER negative, PR negative and HER2 positive tumors are classified as HER2+ [6]. Triple negative BC (TNBC) is characterized by ER negative, PR negative and HER2 negative status [6].

The development of gene expression profiling with microarrays demonstrated that classification based on gene expression profiling reflects the differences between BC subtypes at the molecular level [3]. The pioneering study of Perou et al. in 2000 reported that BC could be classified into four intrinsic molecular subtypes by gene expression profiling: luminal A, luminal B, HER2-enriched (HER2), and basal [7,8]. Gene expression classification refers to the TNBC of immunohistochemistry as basal BC. However, previous studies reported a concordance of 80% between TNBC and basal BC [9]. Unlike the TNBC subtype, basal BC is characterized by the expression of other proteins, such as cytokeratins 5, 6 and 17 [10].

BC molecular subtypes can be detected by different genetic tests, each with a different gene signature (e.g., PAM50, MammaPrint, and Oncotype DX). Several studies applied to publicly available gene expression datasets demonstrated poor reproducibility among the different genetic tests, which can be explained by the differences between their gene signatures [11,12]. These observations have pushed research towards the discovery of new biomarkers for BC subtype characterization.

Luminal A is the most common BC subtype, with a more favorable prognosis and a slower evolution [13]. The luminal B subtype is characterized by an intermediate prognosis compared with luminal A and HER2 BC and by increased expression of genes associated with growth receptor signaling [14].
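The immunohistochemical definitions given above can be read as a simple decision rule. The following is a minimal Python sketch of that rule, not the authors' method or a clinical tool. The names `Subtype`, `KI67_CUTOFF_PERCENT` and `ihc_subtype` are hypothetical and introduced only for illustration. The sketch also assumes that HER2 status is ignored for hormone-receptor-positive tumors, because the passage does not mention it for the luminal groups; the St. Gallen criteria themselves may be more detailed.

```python
from enum import Enum


class Subtype(Enum):
    """Surrogate subtypes named in the text (labels are illustrative)."""
    LUMINAL_A = "luminal A"
    LUMINAL_B = "luminal B"
    HER2_POSITIVE = "HER2+"
    TRIPLE_NEGATIVE = "TNBC"


# Ki-67 threshold quoted in the text: < 14% luminal A, >= 14% luminal B.
KI67_CUTOFF_PERCENT = 14.0


def ihc_subtype(er_positive: bool, pr_positive: bool,
                her2_positive: bool, ki67_percent: float) -> Subtype:
    """Map immunohistochemical marker status to a surrogate subtype.

    Assumption: for ER and/or PR positive tumors, HER2 status is not
    consulted, mirroring the wording of the passage.
    """
    if not 0.0 <= ki67_percent <= 100.0:
        raise ValueError("ki67_percent must be between 0 and 100")

    # ER positive and/or PR positive: split on the Ki-67 cutoff.
    if er_positive or pr_positive:
        if ki67_percent < KI67_CUTOFF_PERCENT:
            return Subtype.LUMINAL_A
        return Subtype.LUMINAL_B

    # ER negative and PR negative: HER2 status decides.
    if her2_positive:
        return Subtype.HER2_POSITIVE
    return Subtype.TRIPLE_NEGATIVE


if __name__ == "__main__":
    # ER+/PR+ tumor with low proliferation
    print(ihc_subtype(True, True, False, 8.0).value)  # prints "luminal A"
```

Encoding the cutoff as a named constant keeps the 14% threshold visible and easy to change if a different Ki-67 criterion is preferred.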
HER2 BC frequently tends to metastasize to the brain, liver and lung. In addition, the overexpression of HER2 is implicated in cell proliferation, the blocking of apoptosis and cell spreading [15]. The basal BC subtype has a worse prognosis than the other subtypes and high cell proliferation. Non-luminal tumors form metastases in distant organs more frequently than luminal tumors, but surprisingly the luminal A and basal subtypes develop regional lymph node metastases less often [16,17]. Luminal A is well differentiated, whereas luminal B, HER2 and basal are poorly differentiated [17]. Previous studies reported that the evolution from normal breast cell types to BC subtypes derives from mutations or genetic rearrangements in stem cells and progenitor cells, giving rise to a heterogeneous population of cells [18].

New, more accurate methods are needed to increase prognostic value, to personalize treatment for patients with BC, and to investigate the molecular mechanisms responsible for BC subtype differentiation. In recent years, long non-coding RNAs (lncRNAs) and RNA-binding proteins (RBPs) have emerged as key regulators of post-transcriptional events, and they are dysregulated in many human solid cancers, including BC [19,20]. LncRNAs, longer than 200 nucleotides, belong to a large class of noncoding RNAs and are (...truncated)


This is a preview of a remote PDF: https://www.nature.com/articles/s41598-021-04664-z.pdf
Article home page: https://www.nature.com/articles/s41598-021-04664-z

Cava, Claudia, Armaos, Alexandros, Lang, Benjamin, Tartaglia, Gian G., Castiglioni, Isabella. Identification of long non-coding RNAs and RNA binding proteins in breast cancer subtypes, Scientific Reports, DOI: 10.1038/s41598-021-04664-z