on March 28, 2012 by Robert Stevens in Under Review

Managing synonymy in OWL

Overview

This article describes approaches to dealing with synonymy in ontologies written in the Web Ontology Language (OWL). Synonymy is an important issue in ontologies, as one of the roles of an ontology is to help manage the vocabulary used for entities in a field of interest. As synonymy is rife, understanding how to deal with the phenomenon in ontologies written in OWL is important. There is a choice between managing synonymy through the use of labels on classes or via equivalence, which states that two extents are the same. The article indicates where the choice should be made and illustrates it with examples.

Authors

Robert Stevens
bioHealth Informatics Group
School of Computer Science
University of Manchester
United Kingdom
M13 9PL

Phillip Lord
School of Computing Science
Newcastle University
Newcastle
United Kingdom
NE3 7RU

Wikipedia describes synonyms as words that have the same, or more or less the same, meaning as another word. A typical biological example would be erythrocyte and red blood cell; these two labels are interchangeable and completely preserve meaning. The two labels have extents that are the same set of cells with the same definitions. Scientific English is rife with this sort of nomenclature difference: from the pseudo-Greek of erythrocyte to the more straightforward red blood cell, and the spellings of sulphur and sulfur, haemoglobin and hemoglobin. There are also all the results of autonomous naming of proteins and genes, as seen in almost any UniProt record for a protein sequence, such as “LARD”.

These are different symbols used by scientists for the same concept (set of instances). In OWL there is a choice of how this is managed:

1. Through use of the equivalence axiom that says that two classes have the same instances.
2. Simply through labelling of the concept or class using annotation properties such as the skos:prefLabel and skos:altLabel of SKOS. Such annotations are part of the editorial metadata that can occur in ontologies.

The choice between these two options is normally made on the basis of modelling intention. If the ontologist wishes to highlight that there are two names, or two labels, for the same concept, then in general they will use the synonymy mechanisms supported via annotation properties in OWL. However, if the ontologist wishes to highlight that there appear to be two different definitions for the same concept, then they would use equivalence. This is a decision related to the distinction between a class and its extent.

We explain this further with a set of examples. In OWL the axiom A EquivalentTo: B states that the extents of the two classes are the same; they are equivalent. The statement

 (Polygon and hasPart exactly 3 sides) EquivalentTo: (Polygon and hasPart exactly 3 angles)

is an equivalence axiom; both classes represent all instances of triangles. However, they have different definitions: one using the concept of a side and one the concept of an angle. One definition, however, implies the other; for a polygon to have three angles, it must have three sides, and vice versa. Here we have different intensional definitions for the same extent of instances; we are capturing this in our ontology by having two definitions and marking them as equivalent.
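The distinction between a definition and its extent can be sketched outside OWL. The following Python fragment is only an illustration under assumptions that are not part of the OWL model: it enumerates a small, closed, hand-written set of shapes, and the names `Shape`, `three_sided`, `three_angled` and `extent` are hypothetical. An OWL reasoner works under the open-world assumption and derives equivalence from axioms rather than by listing instances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    name: str
    sides: int
    angles: int


# A toy, closed universe of simple polygons (assumption: made-up individuals).
SHAPES = [
    Shape("t1", 3, 3),
    Shape("t2", 3, 3),
    Shape("square", 4, 4),
    Shape("pentagon", 5, 5),
]


def three_sided(shape: Shape) -> bool:
    """Definition 1: a polygon with exactly three sides."""
    return shape.sides == 3


def three_angled(shape: Shape) -> bool:
    """Definition 2: a polygon with exactly three angles."""
    return shape.angles == 3


def extent(definition, universe):
    """The extent of a definition: the names of the individuals satisfying it."""
    return {s.name for s in universe if definition(s)}


# Two different definitions, one and the same extent.
same_extent = extent(three_sided, SHAPES) == extent(three_angled, SHAPES)
```

The two predicates are written differently, yet they pick out the same individuals; this is the situation an equivalence axiom records.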

We could also have:

 Class: 000001
     Annotations: label "Triangle", synonym "Polygon with three sides", synonym "Polygon with three angles"
     SubClassOf: Polygon and hasPart exactly 3 Sides

In this case, however, we have only formally captured the definition using three sides. The use of synonym labels here is, therefore, wrong. It is, essentially, mistaking the notion of a polygon with three angles for a label.

As well as being able to state equivalence explicitly, it is also possible to reason computationally that two definitions, in fact, describe the same class. For example, in the Family History Knowledge Base (FHKB), a woman is defined as any person that has sex female, while a daughter is defined as being any woman that is the child of some person. The reasoner infers these to be equivalent; despite the different definitions, it is the case that all women are indeed daughters (of somebody!). The Amino Acids Ontology has some inferred equivalence due to the closing of the world of amino acids to those found in biology.

We can contrast this with examples of clear synonymy, where one definition for a concept simply has multiple labels. Simple examples are “acetic acid” and “ethanoic acid”, and “femur” and “thigh bone”; the Gene Ontology ID GO:0016049 has the label “cell growth” and exact synonyms “cellular growth” and “growth of cell”.
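To show what labelling synonymy buys in practice, the following Python sketch resolves any of several labels to one identifier. It is a minimal illustration, not how the Gene Ontology or any OWL tool stores its annotations: the dictionary layout and the names `TERMS`, `build_index` and `lookup` are assumptions, and only the ID and labels come from the example above.

```python
# One concept, one identifier, several labels (layout is hypothetical).
TERMS = {
    "GO:0016049": {
        "label": "cell growth",
        "exact_synonyms": ["cellular growth", "growth of cell"],
    },
}


def build_index(terms):
    """Map every label and synonym, normalised, to its term identifier."""
    index = {}
    for term_id, entry in terms.items():
        for text in [entry["label"], *entry["exact_synonyms"]]:
            index[text.strip().lower()] = term_id
    return index


INDEX = build_index(TERMS)


def lookup(text):
    """Return the identifier for a label or synonym, or None if unknown."""
    return INDEX.get(text.strip().lower())
```

Here `lookup("Cellular growth")` and `lookup("growth of cell")` both resolve to the single identifier; no second class and no equivalence axiom is involved, only extra labels on one class.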

One specialised use of synonymy is for internationalisation. The distinction between sulphur and sulfur is an example of dialectal variation between British and American English, while sulfur and zolfo are a language difference. While OWL uses the same mechanism to represent these, again, there is a distinction of intention. The labels in different languages are considered to be equivalent, without a preferred label, while synonymy often implies a preferred and a non-preferred term. We could, for example, write:

 Class: 000666
     Annotations: skos:prefLabel "Common cold"@en, skos:prefLabel "Un raffreddore"@it, skos:altLabel "Bit of a sniffle"@en, skos:altLabel "essere costipato"@it

where we see differences in internationalisation and synonymy. Here we clearly have a simple case of multiple labels for the same concept, some down to different languages and some down to preferences within languages. OWL uses the language tags from the IETF to designate the language of a label (many tools allow the display language to be chosen). Unfortunately, it is not possible to define one’s own community tags that would enable community variants of labelling to be taken into account for presentation.
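How a tool might choose a label for display from language-tagged annotations can be sketched as follows. This is a hedged illustration in Python, not the behaviour of any particular OWL editor: the `LABELS` layout, the function names and the fall-back-to-English policy are all assumptions made for the example.

```python
# Labels for one class, keyed by (annotation property, language tag).
LABELS = {
    ("prefLabel", "en"): ["Common cold"],
    ("prefLabel", "it"): ["Un raffreddore"],
    ("altLabel", "en"): ["Bit of a sniffle"],
    ("altLabel", "it"): ["essere costipato"],
}


def display_label(labels, lang, fallback="en"):
    """Return the preferred label for lang, else for the fallback language."""
    for tag in (lang, fallback):
        preferred = labels.get(("prefLabel", tag))
        if preferred:
            return preferred[0]
    return None


def all_labels(labels, lang):
    """Return every label for lang: preferred first, then alternatives."""
    return labels.get(("prefLabel", lang), []) + labels.get(("altLabel", lang), [])
```

The language tag selects between equally valid labels, while the preferred/alternative split orders labels within a language; these are the two separate axes the example frame encodes.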

As always, there are some edge cases where it is less clear whether we see an example of synonymy. For example, compare the labels “Sulphur” and “Brimstone”. While “brimstone” could be considered synonymous with “sulphur”, in modern parlance it has become archaic and has gained religious connotations. Therefore, a priori it is unclear whether “brimstone” should be considered a synonym of sulphur, or a mythical substance of historical, rather than scientific, interest.

From a functional point of view, the most important distinction between equivalence and synonymy is that the former can be inferred automatically while the latter cannot and must always be asserted. It is possible, therefore, for two concepts to lose their equivalence. For example, if woman and daughter are equivalent, this implies inductively that humans must have existed forever, which is not true. If we extended the Family History Knowledge Base to deal with evolutionary as well as historical time, then the equivalence of these concepts would have to change also.
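The way an inferred equivalence can disappear when the model changes can be mimicked with a toy example. The Python sketch below is an assumption-laden illustration and not the FHKB: the people, the dictionary layout and the function names are invented, and it compares extents over a closed list of individuals, whereas an OWL reasoner would derive (or fail to derive) the equivalence from axioms.

```python
def women(people):
    """Extent of 'woman': every person whose sex is female."""
    return {name for name, p in people.items() if p["sex"] == "female"}


def daughters(people):
    """Extent of 'daughter': every woman who is the child of some person."""
    return {name for name in women(people) if people[name]["parents"]}


# Hypothetical historical data: everybody has at least one recorded parent.
people = {
    "Ann": {"sex": "female", "parents": ["Eve"]},
    "Eve": {"sex": "female", "parents": ["Unknown ancestor"]},
    "Bob": {"sex": "male", "parents": ["Ann"]},
}
equivalent_before = women(people) == daughters(people)

# Extending the model to evolutionary time: a first woman with no human parent.
people["First"] = {"sex": "female", "parents": []}
equivalent_after = women(people) == daughters(people)
```

With the historical data alone the two extents coincide; once an individual without a human parent is admitted, they no longer do. A synonym label, by contrast, would stay attached to its class regardless of such changes.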

In summary, there is a relatively straightforward choice to be made between a case of labelling synonymy, with one label being preferred, and a case of differing intensional descriptions with the same extent. The former is managed, in OWL, with annotations (with a range of implementations possible) and the latter with equivalence axioms and, possibly, inference.