Is a class the same as its extent?
While it is true that an ontology class defines a set of individuals, its extent, the reverse is not so. The class is not the same as its extent. In this article, we consider why this is not so when modelling the real world, and also within the context of the formal semantics of OWL.
School of Computing Science,
School of Computer Science
An ontology consists of a number of classes; these logically divide the world up into sets of individuals. The set of individuals belonging to a class is known as the extent of that class. In this article, we consider whether the reverse is true: is it the case that this set of individuals defines the class? We consider this in two ways. First, we investigate the semantics of OWL, the Web Ontology Language. Second, we use a biological example to clarify why there needs to be a separation between the class and its extent.
One of the key features of OWL is that it has a well-defined semantics. In general, this formal semantics is most important when building computational tools to reason over OWL ontologies; the precisely defined meaning of statements makes it possible for independent tools to come to identical and clearly defined conclusions. In this article, we consider this semantics informally, and the implications that this has for the meaning of a class in OWL.
Different users of OWL tend to think of the statements made within the ontology in different ways. Consider a very simple OWL statement such as:
Class: B SubClassOf: A
The simplest interpretation of this statement is “if you are a B then you are also an A”. A slightly more formal logical interpretation is “given B subclass of A, any instance b of B is also an instance of A”. An alternative interpretation is that “B subclass of A implies that the set of instances of B is a subset of the set of instances of A”. These interpretations are all equivalent. We can use the set-theoretic semantics to define two further parts of OWL. owl:Thing can be interpreted as a set containing all possible instances, while a class that is equivalent to the empty or null set, which contains no instances, is “unsatisfiable”; such classes are often called “inconsistent”, although strictly this is a property of an ontology with at least one unsatisfiable class.
At first sight, therefore, this set-theoretic interpretation appears to imply that OWL classes are extensional; that is, they are defined purely by their membership. However, there is an added complication which alters this conclusion.
The set-theoretic interpretation of OWL is made with respect to a mathematical universe; this is not the real universe of things around us, but a collection of all the mathematical individuals that we wish to consider. For a given ontology, there are many different potential mathematical universes; again, this simply means that we can consider different sets of individuals, all of which obey the statements in the ontology.
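The subset reading can be sketched in a few lines of Python. This is only an illustration for one hand-picked interpretation: the individual names (i1 to i5), the dictionary layout and the helper satisfies_subclass are assumptions made for this example and are not part of OWL.

```python
# One hypothetical interpretation: each class name is mapped to a set of individuals.
universe = {"i1", "i2", "i3", "i4", "i5"}
extent = {
    "Thing": set(universe),  # owl:Thing: every individual in this universe
    "Nothing": set(),        # the empty set: no individuals at all
    "A": {"i1", "i2", "i3"},
    "B": {"i1", "i2"},
}


def satisfies_subclass(extent, sub, sup):
    """True if, in this interpretation, every instance of `sub` is also an instance of `sup`."""
    return extent[sub] <= extent[sup]  # `<=` is the subset test on Python sets


b_sub_a = satisfies_subclass(extent, "B", "A")  # "Class: B SubClassOf: A" holds here
a_sub_b = satisfies_subclass(extent, "A", "B")  # the converse does not
```

Every class in this interpretation is a subset of Thing, and the empty set is a subset of every class, which matches the roles described above for owl:Thing and for unsatisfiable classes.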
In one universe, A might have 10 individuals and B might have 4. In another universe, A and B might both be empty, having no individuals. In another, both A and B might contain all individuals. An OWL ontology allows us to distinguish between those universes where all the statements in the ontology are true, or satisfied, and those where this is not the case. The former universes are known as models. There may be many, perhaps infinitely many, models. Conclusions or implications must be necessarily true in every model. That a model exists where neither A nor B has any instances does not make these classes unsatisfiable; for a class to be unsatisfiable there must be no model in which it can have individuals.
The motivation for this form of interpretation lies in OWL’s “open world assumption”: things which are not stated are considered unknown. For a given ontology, it would be a mistake to interpret classes or subclass relations on the basis of their stated individuals. After all, these are only the individuals that we know about, and there could be others.
It is possible, within OWL, to define a class extensionally, using owl:oneOf; this lists all the individuals that belong to the class, in any universe. This form of definition is very much the exception within an OWL ontology rather than the rule.
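The idea of checking a conclusion against every model can be sketched by brute force over a tiny universe. This is a toy under stated assumptions: the two-individual universe and the helpers subsets and models are invented for the illustration, whereas the real OWL semantics ranges over universes of any size and reasoners do not enumerate them.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def models(universe, class_names, axioms):
    """Yield every assignment of extents to classes that satisfies all subclass axioms."""
    for extents in product(subsets(universe), repeat=len(class_names)):
        interpretation = dict(zip(class_names, extents))
        if all(interpretation[sub] <= interpretation[sup] for sub, sup in axioms):
            yield interpretation


universe = {"x", "y"}      # a deliberately tiny, hypothetical universe
axioms = [("B", "A")]      # Class: B SubClassOf: A
found = list(models(universe, ["A", "B"], axioms))

# A conclusion follows only if it holds in every model.
b_sub_a_entailed = all(m["B"] <= m["A"] for m in found)
a_sub_b_entailed = all(m["A"] <= m["B"] for m in found)

# A class is satisfiable if at least one model gives it an instance.
b_satisfiable = any(len(m["B"]) > 0 for m in found)

# A model with both classes empty exists, yet B is still satisfiable.
both_empty_somewhere = any(not m["A"] and not m["B"] for m in found)
```

Of the 16 ways of assigning extents to A and B over this universe, 9 satisfy the axiom. The subclass statement holds in all of them, its converse does not, and B has instances in some of them, so B is satisfiable even though one model leaves it empty.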
At first sight, it might appear that the formal semantics of OWL are not of relevance to modelling biology ontologically, especially if a technology other than OWL is being used. In this case, there is only a single universe with a specific set of individuals, so the idea of “in all possible universes” does not make sense. However, here we argue that the idea of potential individuals is still useful, and clearly demonstrates the distinction between a class and its extent.
Consider three terms: ReceptorProtein, PhotoreceptorProtein and 7-transmembraneProtein. It seems fairly straightforward to deduce that PhotoreceptorProtein is a subclass of ReceptorProtein. By definition, any protein which operates as a photoreceptor must also operate as a receptor. Alternatively, in terms of set-theoretic semantics, the set of photoreceptors is a subset of the set of receptors.
Most biologists will know of the relationship between the 7-transmembrane proteins and photoreceptors: the best known photoreceptor family is the extensive Opsin/Rhodopsin family, whose members are found in organisms as disparate as humans and bacteria. Our quick survey of five biologists showed that all of them were aware of this. However, without recourse to external resources, none were able to say whether there were any photoreceptors which are not also 7-transmembrane proteins. Even with recourse to external resources, they could only answer questions about proteins currently known, not about all proteins.
Consider next the ontology of these classes. If classes were defined simply by their extension, then we would need an answer to this question. If there are no photoreceptors which are not also 7-transmembrane proteins, then the set of photoreceptors is a subset of the set of 7-transmembrane proteins. Under these circumstances PhotoreceptorProtein would be a subclass of 7-transmembraneProtein. Unfortunately, it is difficult or impossible to answer these questions.
However, if we think of classes in the same way as OWL, then instead we ask whether it is possible that a protein might operate as a photoreceptor without being a 7-transmembrane protein. Most biologists would consider it both reasonable and plausible to suggest that this kind of protein might exist. Therefore, PhotoreceptorProtein should not be a subclass of 7-transmembraneProtein.
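The difference between judging a subclass relation by known individuals and by possible individuals can be made concrete with a small sketch. Everything here is illustrative: the Protein record, the helper looks_like_subclass and the entry named hypothetical_photoreceptor are assumptions made for the example, and the list of known proteins is not a survey of real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    photoreceptor: bool
    seven_transmembrane: bool


def looks_like_subclass(individuals, sub, sup):
    """Extensional test: does every individual with property `sub` also have property `sup`?"""
    return all(getattr(p, sup) for p in individuals if getattr(p, sub))


# The proteins we happen to know about (illustrative only).
known = [Protein("rhodopsin", photoreceptor=True, seven_transmembrane=True)]

# A hypothetical protein that seems plausible, even though none is listed as known.
possible = known + [
    Protein("hypothetical_photoreceptor", photoreceptor=True, seven_transmembrane=False)
]

by_known_extent = looks_like_subclass(known, "photoreceptor", "seven_transmembrane")
by_possible_extent = looks_like_subclass(possible, "photoreceptor", "seven_transmembrane")
```

Judged on the known individuals alone, the subclass relation appears to hold; once a single plausible counterexample is admitted, it does not. The second test is the one that corresponds to asking about possible individuals.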
By contrast, consider a roughly analogous example of the two classes TransmembraneProtein and IonTransporterProtein. In this case, there are good grounds for suggesting that an ion transporter protein is also going to be a transmembrane protein; ions cannot directly pass through a lipid bilayer, and it is hard to see how a protein could enable this without having access to both sides of a membrane. There seems to be no possible protein which could be an ion transporter without also being a transmembrane protein. In this case, within an ontology it would seem reasonable that IonTransporterProtein should be either asserted or inferred to be a subclass of TransmembraneProtein.
The consideration of individuals which are possible, which might exist, is therefore useful when building an ontology. This is not to suggest that we need to take into account all possibilities when building an ontology. For instance, within most biomedical ontologies, we are not considering the real universe of all individuals, but a subset of it, which we describe as “non-pathological”. For example, an anatomical ontology might state that the set of individuals of the class Toes is a subset of the class part_of Foot. It can make this statement despite the fact that in the real world there are individuals without toes, or who have toes which are not part of the foot, as they are not being considered in this case. By analogy with OWL semantics, they are outside the “universe”. We might also choose to restrict our universe to a single species, another common assumption in medical ontologies, which excludes many complex possibilities from consideration.
Finally, we can exclude possibilities that are just implausible; for example, when building a biomedical ontology, we would not need to worry about the possible existence of silicon-based life, or other forms of alien existence. If, at some point in the future, either of these is discovered, we will need to change our ontologies; this is likely to be one of the smaller changes caused by this kind of discovery.
Because OWL uses a set-theoretic semantics, it is easy to misinterpret this as meaning that a class is defined by its extent. At first sight, the understanding that classes are defined by any possible extent within a given universe may seem overly complex, but it actually mirrors one process that can be used to determine subclass relations when modelling biological individuals: while a class will contain a specific set of individuals, its extent, we rarely know what all of these individuals are, and they may change in the future. Instead we need to decide which possible individuals a class might contain; it is this set that defines the meaning of the class.