The UMLS Semantic Network and the Semantic Web.

AMIA Annual Symposium Proceedings, Aug 2024

The UMLS Semantic Network and the Semantic Web Vipul Kashyap, Ph.D. National Library of Medicine, Bethesda, Maryland The Unified Medical Language System (UMLS) , an extensive source of biomedical knowledge developed and maintained by the US National Library of Medicine (NLM) is being currently used in a wide variety of biomedical applications. The Semantic Network, a component of the UMLS is a structured description of core biomedical knowledge consisting of well defined semantic types and relationships between them. We investigate the expressiveness of DAML+OIL, a markup language proposed for ontologies on the Semantic Web, for representing the knowledge contained in the Semantic Network. Requirements specific to the Semantic Network, such as polymorphic relationships and blocking relationship inheritance are discussed and approaches to represent these in DAML+OIL are presented. Finally, conclusions are presented along with a discussion of ongoing and future work. INTRODUCTION The Unified Medical Language System (UMLS) project was initiated in 1986 by the U.S. National Library of Medicine (NLM). Its goal is to help health professionals and researchers use biomedical information from different sources1. It consists of three main knowledge repositories: (a) The UMLS Metathesaurus, which provides a common structure for more than 95 source biomedical vocabularies. It is organized by concept, which is a cluster of terms (e.g., synonyms, lexical variants, translations) with the same meaning. (b) The UMLS Semantic Network2, which categorizes these concepts through semantic types and relationships. (c) The SPECIALIST lexicon contains over 30,000 English words, including many biomedical terms. Information for each entry, including base form, spelling variants, syntactic category, inflectional variation of nouns and conjugation of verbs, is used by the lexical tools11. The 2002 version of the Metathesaurus contains 871,584 concepts named by 2.1 million terms. 
It also includes inter-concept relationships across multiple vocabularies, concept categorization, and information on concept co-occurrence in MEDLINE.

The UMLS Semantic Network is well suited to representation using DAML+OIL [5] constructs, as it has a rich semantic structure and an underlying metamodel consistent with the DAML+OIL specification. In this paper, we investigate the expressiveness of DAML+OIL constructs for representing the knowledge contained in the Semantic Network. The results of this work will also be applied to the UMLS Metathesaurus.

DAML+OIL: AN ONTOLOGY LANGUAGE FOR THE SEMANTIC WEB

The recognition of the key role that ontologies are likely to play in the future of the Web has led to the extension of Web markup languages in order to facilitate content description and the development of web ontologies, e.g., XML Schema [7], RDF [4] and RDF Schema [8]. However, more expressive power is both necessary and desirable in order to describe data in sufficient detail and to enable automated reasoning, e.g., determining semantic relationships between syntactically different terms.

The DAML+OIL language [5] is designed to describe the structure of a domain. It takes an object-oriented approach, with the structure of the domain being described in terms of classes and properties. An ontology consists of a set of axioms that assert characteristics of these classes and properties.

We now discuss the various constructs in DAML+OIL and their foundations in Description Logics (DLs) [9]. DAML+OIL is, in essence, equivalent to a very expressive DL, with a DAML+OIL ontology corresponding to a DL terminology. As in a DL, DAML+OIL classes can be names (URIs) or expressions. A variety of constructors (or operators) are provided for building class expressions. The expressive power of the language is determined by the class (and property) constructors provided, and by the kinds of axioms allowed.
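As a rough illustration of how a set of axioms supports automated reasoning, the following minimal sketch treats an ontology as a list of subClassOf axioms over named classes and derives subsumption by following those axioms transitively. The class names, the AXIOMS list and the helper functions are hypothetical and chosen only for illustration; a DAML+OIL reasoner handles far more than named-class hierarchies.

```python
from collections import defaultdict

# Illustrative subClassOf axioms as (subclass, superclass) pairs.
# This toy hierarchy is an assumption, not taken from the paper.
AXIOMS = [
    ("Bacterium", "Organism"),
    ("Virus", "Organism"),
    ("Organism", "PhysicalObject"),
    ("PhysicalObject", "Entity"),
]


def superclasses(cls, axioms):
    """Return cls plus every class reachable by following subClassOf axioms."""
    parents = defaultdict(set)
    for child, parent in axioms:
        parents[child].add(parent)
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subclass_of(sub, sup, axioms):
    """True if sup subsumes sub under the reflexive-transitive closure."""
    return sup in superclasses(sub, axioms)
```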
Table 1 summarizes the constructors used in DAML+OIL, expressed using the standard DL syntax. In the RDF syntax, the expression Bacterium ∩ Virus would be written as:

    <daml:Class>
      <daml:intersectionOf rdf:parseType="daml:collection">
        <daml:Class rdf:about="#Bacterium"/>
        <daml:Class rdf:about="#Virus"/>
      </daml:intersectionOf>
    </daml:Class>

The meanings of the first three constructors from Table 1 are just the standard boolean operators on classes. The oneOf constructor allows classes to be defined by enumerating their members. The toClass and hasClass constructors correspond to slot constraints in a frame-based language.

AMIA 2003 Symposium Proceedings − Page 351

Table 1: DAML+OIL class constructors

Constructor        DL Syntax        Example
intersectionOf     C1 ∩ … ∩ Cn      Bacterium ∩ Animal
unionOf            C1 ∪ … ∪ Cn      Bacterium ∪ Virus
complementOf       ¬C               ¬Plant
oneOf              {x1,…, xn}       {aspirin, tylenol}
toClass            ∀P.C             ∀partOf.Cell
hasClass           ∃P.C             ∃processOf.Organism
hasValue           ∃P.{x}           ∃treatedBy.{aspirin}
minCardinalityQ    ≥ n P.C          ≥ 2 hasPart.Cell
maxCardinalityQ    ≤ n P.C          ≤ 1 hasPart.Tissue
cardinalityQ       = n P.C          = 1 partOf.Cell

The class ∀P.C is the class all of whose instances are related via the property P only to resources of type C, while the class ∃P.C is the class all of whose instances are related via the property P to at least one resource of type C. The hasValue constructor is just shorthand for a combination of hasClass and oneOf.

The minCardinalityQ, maxCardinalityQ and cardinalityQ constructors (known in DLs as qualified number restrictions) are generalizations of the hasClass and hasValue constructors. The class ≥ n P.C (≤ n P.C, = n P.C) is the class all of whose instances are related via the property P to at least (at most, exactly) n different resources of type C. The emphasis on different is because there is no unique name assumption with respect to resource names (URIs), and it is possible that many URIs could name the same resource.

Table 2 summarizes the axioms allowed in DAML+OIL.
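The set-theoretic reading of the Table 1 constructors can be sketched by evaluating them over a small finite interpretation. The sketch below is only illustrative: the individuals, class extensions and hasPart facts are invented, the function names are hypothetical, and it assumes that distinct names denote distinct individuals, which DAML+OIL itself does not assume.

```python
# Toy finite interpretation; every individual and fact here is made up.
DOMAIN = {"cell1", "cell2", "tissue1", "organ1"}
CELL = {"cell1", "cell2"}
TISSUE = {"tissue1"}

# Property extensions as sets of (subject, object) pairs.
PROPERTIES = {
    "hasPart": {("tissue1", "cell1"), ("tissue1", "cell2"), ("organ1", "tissue1")},
}


def fillers(prop, x):
    """All resources that x is related to via prop."""
    return {o for (s, o) in PROPERTIES[prop] if s == x}


def intersection_of(*classes):   # C1 ∩ … ∩ Cn
    return set.intersection(*classes)


def union_of(*classes):          # C1 ∪ … ∪ Cn
    return set.union(*classes)


def complement_of(c):            # ¬C, relative to the finite domain
    return DOMAIN - c


def to_class(prop, c):           # ∀P.C: every filler is in C (vacuously true if none)
    return {x for x in DOMAIN if fillers(prop, x) <= c}


def has_class(prop, c):          # ∃P.C: at least one filler is in C
    return {x for x in DOMAIN if fillers(prop, x) & c}


def min_cardinality_q(n, prop, c):   # ≥ n P.C
    return {x for x in DOMAIN if len(fillers(prop, x) & c) >= n}


def max_cardinality_q(n, prop, c):   # ≤ n P.C
    return {x for x in DOMAIN if len(fillers(prop, x) & c) <= n}


def cardinality_q(n, prop, c):       # = n P.C
    return min_cardinality_q(n, prop, c) & max_cardinality_q(n, prop, c)
```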
The axioms in Table 2 make it possible to assert subsumption or equivalence with respect to classes or properties, the disjointness of classes, the equivalence or non-equivalence of individuals (resources), and various properties of properties. A crucial feature of DAML+OIL is that subClassOf and sameClassAs axioms can be applied to arbitrary class expressions. The last two rows of Table 2 refer to the DAML+OIL constructs domain and range, which identify the domain and range classes of the various properties. Their DL constructors are as shown. Later in the paper, we discuss various approaches to representing domains and ranges and the impact they might have on the complexity of the reasoning process.

DAML+OIL also allows properties of properties to be asserted. It is possible to assert that a property is unique (i.e., functional) and unambiguous (i.e., its inverse is functional). It is also possible to use inverse properties and to assert that a property is transitive.

DAML+OIL REPRESENTATION OF THE SEMANTIC NETWORK

We now present a DAML+OIL representation of a sma (...truncated)


This is a preview of a remote PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1480032/pdf/
Article home page: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1480032

V. Kashyap. The UMLS Semantic Network and the Semantic Web. AMIA Annual Symposium Proceedings, pp. 351.