Post-coordination: Making things up as you go along
Pre- and post-coordination are notions that came out of SNOMED and concern how to handle the vast number of terms needed for coding (in SNOMED’s case, coding medical records): can one enumerate all the needed terms and put them in the right place in a structured vocabulary before use, or does one put in a mechanism for creating terms as and when they are needed and putting them in the right place in the vocabulary’s structure? A pre-coordinated ontology has all the terms, and the relationships between them, needed by an application; it is static; ‘what you see is what you get’. A post-coordinated ontology has the building blocks for the terms needed in an application, so that terms can be built as required and their relationships determined as required; it is dynamic; the ontology is much more than ‘what you see’, since you can compose new expressions from the given building blocks. An ontology built using the Web Ontology Language (OWL) can take a compositional approach: it is built, and can be used, by composing classes and properties to make other classes. For example, an expression for large hydrophilic amino acid could be composed from the classes Hydrophilic amino acid and Large amino acid; in turn, these can themselves be composed from Amino acid and various qualities of amino acids. In this approach a reasoner can be used to determine the relationship between expressions built on the fly and the classes from the base ontology, i.e., expressions can be classified and placed at the correct location in the class hierarchy. This KBlog describes post-coordination, distinguishes it from pre-coordination, and discusses when post-coordination can be used, either at build time or at delivery time within an application.
Robert Stevens and Uli Sattler
Bio-health Informatics and Information Management Groups
School of Computer Science
University of Manchester
The notion of pre- and post-coordination (10.1016/j.jbi.2011.10.002) is used in terminologies such as <a href="http://en.wikipedia.org/wiki/SNOMED_CT#Pre-and_postcoordination">SNOMED CT</a> to manage the vast number of terms required for coding medical records without having to enumerate them all. Though it arose from SNOMED, the idea of pre- and post-coordination is widely applicable wherever ontologies are used to describe things and it is likely that not all the desired terms can be made before use. To illustrate, assume we have a (possibly very long) list of expressions e1, e2, e3, … that we want to use in a given application, e.g., to label documents. Large hydrophilic amino acid is an example of such an expression. Also, assume that we have an ontology, vocabulary, or similar, called O, for these terms. Now pre-coordination and post-coordination relate to the following questions:
- Does O contain a term for each of the expressions ej I want to use?
- Or can I build a legal expression using building blocks from O for each of the expressions ej I want to use?
- Does O capture all the relevant relations between the expressions ej, ek; e.g., that ej is a specialisation of ek?
If we answer the first and third questions with yes, then we can say that O is pre-coordinated. If we answer the second and third questions with yes, then O can be post-coordinated, and the degree to which it is depends on the number of expressions ej for which O does not contain a single term, but requires the construction of a suitable expression. Finally, O can be both pre-coordinated and post-coordinated: e.g., O may have a term for large hydrophilic amino acid, but may also be able to handle the expression Hydrophilic amino acid and Large amino acid.
This can be illustrated with some examples using the Amino Acids Ontology. First of all, we can simply name a class of amino acid called Lysine:
Class: Lysine SubClassOf: AminoAcid
Lysine is a named class, and we have already stated how it relates to AminoAcid: it is a specialisation of it. We can further co-ordinate Lysine with other classes in the ontology to describe Lysine in terms of those classes, and we can do so for all 20 amino acids; descriptions of each of these can be composed from charge (positive, neutral or negative), polarity (polar or non-polar), size (tiny, small, medium or large) and hydrophobicity (hydrophilic or hydrophobic); here is this description for Lysine:
Class: Lysine
    SubClassOf:
        AminoAcid,
        hasHydrophobicity some Hydrophilic,
        hasSideChainStructure some Aliphatic,
        hasCharge some Positive,
        hasSize some Large,
        hasPolarity some Polar
We can also introduce terms for other classes of amino acids, e.g.:
Class: 'Positive amino acid'
    EquivalentTo: AminoAcid and hasCharge some Positive

Class: 'Hydrophilic amino acid'
    EquivalentTo: AminoAcid and hasHydrophobicity some Hydrophilic
Using the resulting ontology, I have two choices to make:
- I can use a reasoner to determine the (so far implicit) relationships between the terms introduced in it; e.g., it will determine that Lysine is a specialisation of Positive amino acid. I can then choose to add these relationships explicitly to my ontology, which would make it more pre-coordinated, but also possibly harder to maintain: if I find an error, say, in the definition of Hydrophilic amino acid, I will have to fix this error as well as the inferences I have drawn and materialised from it. Alternatively, I can leave these relationships implicit, which will make fixing errors easier, but will require a reasoner to determine these relationships.
- I can restrict myself to using terms specified in the ontology, i.e., named classes such as Positive amino acid, or I can build expressions from these, e.g., Positive amino acid and Large amino acid, or Positive amino acid and hasHydrophobicity some Hydrophilic. In the latter case, we can say that I use the ontology in a post-coordinated way, and this would of course require the use of a reasoner to determine the relationship between the expressions used and the classes defined in my ontology.
In this sense, we should rather speak about using an ontology in a pre-/post-coordinated way, and note that using an ontology in a post-coordinated way requires a reasoner (or similar tool) to determine the relation between the freshly made up expressions and those specified in the ontology. Similarly, we can say that an annotation tool supports post-coordination if it allows annotations in the form of expressions, and is able to determine the relationships between these expressions.
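To make the reasoner's role concrete, here is a minimal sketch in Python of how a freshly made up expression can be related to the named classes of the amino acid example. It is a toy structural check, not an OWL reasoner: it only understands conjunctions of named classes and restrictions of the form `property some Filler`, and the names `some`, `saturate`, `is_subsumed_by` and `named_superclasses` are our own, hypothetical helpers. A real application would hand this job to a proper reasoner.

```python
# Toy illustration only: handles conjunctions of named classes and
# "property some Filler" restrictions with named fillers, nothing more.

def some(prop, filler):
    """Represent the restriction 'prop some filler' as a hashable tuple."""
    return (prop, filler)

# SubClassOf axioms: necessary conditions for the named class.
SUBCLASS_OF = {
    "Lysine": {
        "AminoAcid",
        some("hasHydrophobicity", "Hydrophilic"),
        some("hasSideChainStructure", "Aliphatic"),
        some("hasCharge", "Positive"),
        some("hasSize", "Large"),
        some("hasPolarity", "Polar"),
    },
}

# EquivalentTo axioms: necessary and sufficient conditions.
EQUIVALENT_TO = {
    "Positive amino acid": {"AminoAcid", some("hasCharge", "Positive")},
    "Hydrophilic amino acid": {"AminoAcid", some("hasHydrophobicity", "Hydrophilic")},
}

def saturate(expression):
    """Add everything that follows from the conjuncts of an expression."""
    atoms = set(expression)
    changed = True
    while changed:
        before = len(atoms)
        for atom in list(atoms):
            # A named class brings in its stated superclasses and its definition.
            atoms |= SUBCLASS_OF.get(atom, set())
            atoms |= EQUIVALENT_TO.get(atom, set())
        for name, definition in EQUIVALENT_TO.items():
            # A defined class is entailed once its whole definition is present.
            if definition <= atoms:
                atoms.add(name)
        changed = len(atoms) != before
    return atoms

def is_subsumed_by(sub, sup):
    """True if every conjunct of `sup` follows from `sub`."""
    return set(sup) <= saturate(sub)

def named_superclasses(expression):
    """Named classes of the toy ontology that subsume the expression."""
    names = set(SUBCLASS_OF) | set(EQUIVALENT_TO)
    return sorted(n for n in names if is_subsumed_by(expression, {n}))

# Implicit relationship between two named classes.
print(is_subsumed_by({"Lysine"}, {"Positive amino acid"}))  # True

# A post-coordinated expression:
# Positive amino acid and hasHydrophobicity some Hydrophilic
expr = {"Positive amino acid", some("hasHydrophobicity", "Hydrophilic")}
print(named_superclasses(expr))  # ['Hydrophilic amino acid', 'Positive amino acid']
print(is_subsumed_by({"Lysine"}, expr))  # True
```

The expression `expr` has no name in the ontology, yet it is placed below both defined classes and above Lysine, which is exactly the classification step that using an ontology in a post-coordinated way relies on.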
If we know that we are going to use an ontology in a post-coordinated way, then we know that we don’t have to introduce terms/named classes for each expression that we ever want to use – we can make them up given the base vocabulary from the ontology. As a consequence, we can build our ontology
- with fewer class names: we may choose to define a class name for Positive amino acid because it’s a commonly used term, but we may also choose not to give a name to Large Positive amino acid, Large Positive Hydrophilic amino acid, …
- with a clear structure: its dimensions reflect the application area’s dimensions and can be used to compose relevant terms
- without a combinatorial explosion of terms introduced: consider, e.g., an ontology of diseases with the dimensions location (in some body part), cause (accident, infection, genetic, …), status (chronic, acute, …), etc., and imagine we had to introduce a term for each possible combination. In contrast, using an ontology in a post-coordinated way, we can introduce some prominent names, e.g., congenital heart disease, but leave others to post-coordination, e.g., fracture to the fibula caused by an accident involving a bicycle.
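The size of the combinatorial explosion can be estimated with a little arithmetic. The sketch below uses the four amino acid dimensions mentioned earlier (charge, polarity, size and hydrophobicity) and counts every expression that picks at most one value per dimension. The counting rule is our own simplifying assumption; it counts syntactic combinations only, including ones that no real amino acid satisfies.

```python
from itertools import product

# Dimensions and values as listed in the amino acid example.
DIMENSIONS = {
    "charge": ["positive", "neutral", "negative"],
    "polarity": ["polar", "non-polar"],
    "size": ["tiny", "small", "medium", "large"],
    "hydrophobicity": ["hydrophilic", "hydrophobic"],
}

def all_combinations(dimensions):
    """Yield every non-empty choice of at most one value per dimension."""
    # None stands for "this dimension is left unspecified".
    options = [[None] + values for values in dimensions.values()]
    for choice in product(*options):
        picked = tuple(value for value in choice if value is not None)
        if picked:
            yield picked

# Building blocks a post-coordinated ontology has to provide.
building_blocks = sum(len(values) for values in DIMENSIONS.values())

# Named terms a fully pre-coordinated ontology would have to enumerate.
pre_coordinated_terms = sum(1 for _ in all_combinations(DIMENSIONS))

print(building_blocks)        # 11
print(pre_coordinated_terms)  # 179
```

Eleven quality values are enough to compose 179 distinct expressions, and each further dimension multiplies that number, whereas it only adds a handful of building blocks.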
So, in a nutshell, one gets classes on demand, together with their relationships. The ontology provides the building blocks, and a class that gives a description is made when it’s needed. Of course, one can judge when it is worth naming a class and putting it in the ontology: when it’s frequently used, when it’s used as a part of another expression, etc. By not putting all possible classes into an ontology one saves space, reduces clutter, increases comprehensibility, etc.
Of course, it can’t all be positive: If an ontology is used in a post-coordinated way,
- we have to make more decisions: which classes do we name (or for which expressions do we introduce terms)?
- we need to use a reasoner to determine the relationship between two class expressions (or a class expression and class names). This can be a bit tricky to set up (though the OWL API should help) and may cause worries regarding performance (but tremendous progress has recently been made w.r.t. reasoner performance).
- we may want a single, unique ID for each term: e.g., if I have defined a class Positive amino acid in my ontology, and then use this ontology in a post-coordinated way, I can of course use AminoAcid and hasCharge some Positive in my annotation. The reasoner will determine that they are equivalent, but the annotation looks different, so I may have to be more careful about dealing with these annotations.
That’s all we can think of.
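One possible way of dealing with the last point is to store annotations under a canonical key, so that a named class and its spelled-out definition end up in the same place. The following sketch is an assumption on our part rather than an established recipe: `unfold` and `preferred_label` are hypothetical helpers, the definitions are assumed to be acyclic, and only equivalences that arise from unfolding definitions are caught. Anything subtler still needs a reasoner.

```python
# Restrictions 'prop some Filler' are written as (prop, filler) tuples.
EQUIVALENT_TO = {
    "Positive amino acid": frozenset({"AminoAcid", ("hasCharge", "Positive")}),
}

def unfold(expression):
    """Replace defined names by their definitions; return a canonical key."""
    atoms = set()
    todo = list(expression)
    while todo:
        atom = todo.pop()
        if atom in EQUIVALENT_TO:
            # Defined name: continue with the conjuncts of its definition.
            todo.extend(EQUIVALENT_TO[atom])
        else:
            atoms.add(atom)
    return frozenset(atoms)

def preferred_label(expression):
    """Return the named class with the same canonical key, if there is one."""
    key = unfold(expression)
    for name, definition in EQUIVALENT_TO.items():
        if unfold(definition) == key:
            return name
    return None

named = {"Positive amino acid"}
composed = {"AminoAcid", ("hasCharge", "Positive")}

print(unfold(named) == unfold(composed))  # True
print(preferred_label(composed))          # Positive amino acid
```

Annotations indexed by `unfold(...)` can then be retrieved regardless of which of the two forms the annotator happened to write.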
The act of composing one class with others and then linking it to other classes is coordination. This can be done exclusively at the time of building the ontology, which we can then use in a pre-coordinated way. Alternatively, an OWL ontology can be used in some software, e.g., to deal with document annotations, together with an automated reasoner. This means that classes can be composed or coordinated on the fly, with the reasoner placing the newly minted class in the appropriate place in the ontology’s hierarchy. In this case, this is post-coordination.