Coverage of oncology drug indication concepts and compositional semantics by SNOMED-CT.

AMIA Annual Symposium Proceedings, Aug 2024

To evaluate SNOMED-CT’s ability to represent simple and compositional concepts in FDA-approved oncology drug indications. Oncology drug indications were decomposed into single and compositional concepts. SNOMED-CT’s coverage of single ...

Coverage of Oncology Drug Indication Concepts and Compositional Semantics by SNOMED-CT®

Steven H. Brown, MD (1, 2); Brent A. Bauer, MD (3); Dietlind L. Wahner-Roedler, MD (3); Peter L. Elkin, MD (3)
(1) Department of Veterans Affairs, (2) Vanderbilt University, (3) Mayo Clinic

Objective: To evaluate SNOMED-CT’s ability to represent simple and compositional concepts in FDA-approved oncology drug indications.

Methods: Oncology drug indications were decomposed into single and compositional concepts. SNOMED-CT’s coverage of single concepts and the semantics needed to create compositional concepts were evaluated using automated and manual techniques.

Results: SNOMED-CT covered 86.3% of single concepts present in oncology drug indications; 11.3% of indications were covered completely. Coverage was best for concepts describing diseases, anatomy, and patient characteristics. Medications accounted for 50.5% of missing concepts. Excluding drug names, 45.2% of indications were completely represented. SNOMED-CT’s semantics completely represented 60.1% of compositional expressions.

Conclusions: SNOMED-CT’s overall coverage of the concepts in oncology drug indications was good. Improvements or alternatives are needed for medications and semantics.

Introduction

In the past five years a number of papers detailing desirable characteristics of terminologies have been published. In 1998, Chute documented 11 characteristics that terminologies should have or evolve to have in order to meet important needs of health care [1]. Cimino’s [2] work from the same year described 12 “desiderata” synthesized from the literature of medical vocabulary research. ASTM E 2087-00, published in 2000, enumerated over 50 quality indicators for controlled health vocabularies [3]. ISO TS 17117 [4] carries forward the ideas in ASTM E 2087 as an international technical specification. Two additional publications [5, 6] advance our understanding of terminology quality indicators even further. While the guideline authors may disagree on certain fine points, the importance of content coverage is universally acknowledged. In our experience, the importance of content coverage is understood and accepted by technical and non-technical audiences alike. “Content, content, content” [2] delivers the message succinctly.

Content coverage studies are not new to the literature. For example, in 1977 Lowery et al examined ICD, SNOMED, and the Cardiff system for coding congenital malformations and genetic syndromes [7]. A number of subsequent content coverage studies further evaluated the SNOMED family of terminologies [8-14].

SNOMED-CT is a reference terminology created from the combination of SNOMED-RT and the National Health Service’s Clinical Terms version 3 [15]. According to the July 2002 fact sheet, SNOMED-CT contained 333,000 concepts and approximately 1,000,000 “is a” semantic relationships. SNOMED-CT supports the composition of new terms through the combination of existing concepts. A national license for SNOMED-CT was being negotiated by the NLM at the time this manuscript was written. If this license agreement comes to pass, SNOMED-CT could become a de facto national standard. Thus, understanding the content coverage of SNOMED-CT is of particular importance at this time.

Compositionality has been proposed and successfully demonstrated as an approach to improve content coverage [16-18]. For example, post-coordinated composition of UMLS concepts to represent problem statements has performed significantly better than UMLS concepts alone [19]. The linkage of two or more concepts is typically achieved using a formal semantic that details the concepts’ relationship.
For example, the concepts “enalapril” and “angiotensin converting enzyme inhibition” could be joined by the semantic relationship “has mechanism of action.” Post-coordinating a terminology’s concepts via its semantics suggests another type of study: the content coverage of the linking semantics. We believe semantics are an important part of compositional terminologies. Others agree. For instance, Bakken evaluated SNOMED-CT’s semantics in a study of nursing diagnoses [20].

In the current study, we evaluate SNOMED-CT’s ability to represent the content of a set of FDA-approved oncology drug indications and perform a preliminary analysis of its semantics.

AMIA 2003 Symposium Proceedings − Page 115

Methods

Approved oncology drug indications (table 1) were downloaded from the FDA Oncology Tools website [21]. SNOMED-CT version 1.0 from the College of American Pathologists was employed.

All downloaded indications were manually broken into single concepts and compositional concepts. Our method identified the shortest medically sensible compositional concepts within the indication. Expressions composed of two concepts (e.g. oral + capsule) were identified whenever possible. A second author verified each proposed compositional expression. Examples of single and compositional concepts identified within indications are given in table 1.

Each single or compositional concept was categorized as relating to treatments, diseases, patients, medications, anatomy, or other. Only concepts that mentioned a specific medication were classified as medication related. Concepts referring to broad classes of medications were classified as treatment related. Descriptive statistics and tables documenting the most commonly occurring single and compositional expressions are presented in the results section.

SNOMED-CT’s content coverage of the identified single concepts was measured in two phases.
In the first phase, automated concept identification tools available in our lab [19, 22] were applied to each indication. The output was an XML file containing the original indications and all mapped SNOMED concepts. Each mapping from an indication concept to a SNOMED concept was manually reviewed for correctness. The indication concepts that were not mapped properly via the algorithmic approach were manually reviewed using the Mayo Vocabulary Server and Browser tool loaded with SNOMED-CT. In this manner, single concepts were determined to be present or absent.

SNOMED-CT’s coverage of the semantics needed to form compositional expressions was evaluated by manual modeling. The single ‘best’ fitting semantic relation was used to link the concepts forming each compositional expression. The adequacy of each semantic’s representation of the meaning of the compositional expression was judged by consensus of two reviewers to be 1) complete, 2) partial, or 3) inadequate.

Results

The FDA website contained 115 indications for 68 unique drugs. We identified 1527 concepts in the 115 indications. The mean number of concepts per indication was 13.3 (95% CI 12.0 – 14.2) with a range from 3 to 48. Table 1 shows two representative indications and the concepts identified within them. The ten most commonly found single concepts and their frequency of occurrence are: “Patients” (56), “Treatment” (44), “Cancer” (44), “Therapy” (34), “Combination” (34), “Metastatic” (25 (...truncated)
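
The measurements described in the Methods can be illustrated with a small sketch. It is not the authors' tooling: the class and function names, the toy concepts, the placeholder drug name "drug_x", the relation label "related to", and every present/absent flag below are assumptions made for illustration. The sketch also assumes coverage is counted over distinct concepts, which the text does not specify. Only the totals 1527 and 115, the "oral + capsule" pair, and the enalapril example come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Adequacy(Enum):
    """Reviewer rating of how well a relation captures an expression's meaning."""
    COMPLETE = "complete"
    PARTIAL = "partial"
    INADEQUATE = "inadequate"


@dataclass(frozen=True)
class CompositionalExpression:
    """Two single concepts joined by the single best-fitting semantic relation."""
    left: str
    relation: str
    right: str
    adequacy: Adequacy


def single_concept_coverage(present: dict[str, bool]) -> float:
    """Percentage of distinct single concepts judged present in the terminology."""
    return 100.0 * sum(present.values()) / len(present)


def complete_indication_rate(indications: list[list[str]],
                             present: dict[str, bool]) -> float:
    """Percentage of indications in which every single concept is present."""
    complete = sum(all(present[c] for c in concepts) for concepts in indications)
    return 100.0 * complete / len(indications)


def adequacy_rate(expressions: list[CompositionalExpression],
                  level: Adequacy) -> float:
    """Percentage of compositional expressions rated at the given level."""
    return 100.0 * sum(e.adequacy is level for e in expressions) / len(expressions)


# Totals reported in the Results section.
REPORTED_CONCEPTS = 1527
REPORTED_INDICATIONS = 115
mean_concepts = round(REPORTED_CONCEPTS / REPORTED_INDICATIONS, 1)  # 13.3

# Hypothetical toy data: the flags and ratings are invented, not study results.
toy_present = {
    "patients": True, "metastatic": True, "cancer": True,
    "oral": True, "capsule": True, "drug_x": False,
}
toy_indications = [
    ["patients", "metastatic", "cancer"],
    ["oral", "capsule", "drug_x"],
]
toy_expressions = [
    CompositionalExpression("enalapril", "has mechanism of action",
                            "angiotensin converting enzyme inhibition",
                            Adequacy.COMPLETE),
    CompositionalExpression("oral", "related to", "capsule", Adequacy.PARTIAL),
]

if __name__ == "__main__":
    print(mean_concepts)
    print(single_concept_coverage(toy_present))
    print(complete_indication_rate(toy_indications, toy_present))
    print(adequacy_rate(toy_expressions, Adequacy.COMPLETE))
```

In the toy data a single missing medication name leaves the second indication incompletely covered even though most of its concepts are present. This is the same pattern as the reported gap between concept-level coverage and indication-level coverage.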


This is a preview of a remote PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1480079/pdf/
Article home page: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1480079

S. Brown, B. Bauer, D. Wahner-Roedler, P. Elkin. Coverage of oncology drug indication concepts and compositional semantics by SNOMED-CT. AMIA Annual Symposium Proceedings, pp. 115,