OWL, an ontology language
This article takes the reader on an introductory tour of OWL, with particular attention to the meaning of OWL statements, their entailments, and what reasoners do. Related Knowledge Blog posts include one on ontology components, one on OWL syntaxes, and one on the extent of classes.
There are numerous ontology languages around, most prominently the Web Ontology Language OWL. OWL has been developed based on experiences with its predecessors DAML+OIL and OIL, and its design has been carried out by W3C working groups. OWL 2 is an extension and revision of OWL (published in 2004) and is a W3C recommendation.
OWL and OWL 2 are called Web Ontology Languages because they are based on web standards such as XML, IRIs, and RDF, and because they are designed in such a way that they can be used over the web (for example, one OWL file can import others by their URI). There are numerous usages of OWL and OWL 2, however, that are rather local, for example to a software or information system.
These languages come with a lot of options and choices, which we will only briefly mention here, and only come back to when they are important. OWL comes in three flavours (OWL Full, OWL Lite, and OWL DL), and OWL 2 comes with two semantics (i.e., two ways of determining the meaning of an ontology, direct and RDF-based) and three profiles (i.e., fragments or syntactic restrictions, called OWL 2 EL, QL and RL), and you can choose between a number of syntaxes to save your ontology in. Since the tools and especially the reasoners around mostly support OWL 2’s direct semantics and OWL DL, we will concentrate here on those. Also, OWL 2 is backwards compatible with OWL, so we can discuss advantages and new features of OWL 2 elsewhere, and can forget the difference for now and just talk about OWL (and mean both OWL and OWL 2).
Next, we would like to issue a warning: OWL has been designed to be consumed by computers, so in its natural form (especially in certain syntaxes), it is really hard for humans to read or write. For example, the following snippet of an OWL ontology in the RDF syntax says that
a:Boy owl:equivalentClass _:x .
_:x rdf:type owl:Class .
_:x owl:intersectionOf ( a:Child a:Male ) .
boys are exactly those children who are male. The same example in OWL’s functional-style syntax looks more readable,
EquivalentClasses( Boy ObjectIntersectionOf( Child Male ) )
but we can easily imagine a much nicer presentation of this statement, and tool developers have designed useful, goal- or application-oriented tools or visualisations. This is clearly a good thing: it helps the user to interact with an (OWL) ontology, without requiring them to be fluent in the ontology language and while supporting the task at hand.
Now what is in an OWL ontology? There is some stuff like headers and declarations around an ontology but, in essence, an OWL ontology is a set of axioms, and each of these makes a statement that we think is true about our view of the world. An axiom can say something about classes, individuals, and properties. For example, the following axioms (in a compact notation that borrows keywords such as ‘only’ from the Manchester syntax) talk about two classes, Man and Person, one property, hasChild, and two individuals, Alice and Bob.
SubClassOf( Man Person )
SubClassOf(Person (hasChild only Person))
ClassAssertion(Bob Man)
PropertyAssertion(hasChild Bob Alice)
Roughly speaking, these axioms say something about these classes, properties, and individuals, and this meaning is fixed through their semantics, which allows us to distinguish interpretations/structures/worlds/… that satisfy these axioms from those that don’t. For example, a structure where every Man is a Person would satisfy the first axiom, whereas one where we have a Man who is not a Person would not. Rather confusingly for modellers in general, we call an interpretation/structure/world/… that satisfies all axioms of an ontology a model of this ontology. It is worth pointing out that one ontology can have many, many models, of varying size and even infinite ones. And here we can even have a sneak preview of reasoning or inferencing: assume the axioms in our ontology are such that, in all its models, every GrandParent is a Parent. Then we call this an entailment or a consequence of our ontology, and we expect a reasoner to find this out and let us know (if you are familiar with Protégé, then you might have seen an inferred class hierarchy, which is basically this).
In more detail, this semantics works as follows: first, fix a set. Any set of things will do, finite or infinite, as long as it is not empty. Then, take each class name (such as Man) and interpret it as a subset of your set; any subset is fine, it can even be empty. Then, take each property name (such as hasChild) and interpret it as a relation on your set (basically by drawing edges between your elements); again, you are free to choose whatever relation you like. Then, take each individual name (such as Bob) and interpret it as one of your elements. Altogether, you now have an interpretation (but remember that one ontology can have many, many interpretations). Now, to check whether your interpretation satisfies your ontology, you can go through your ontology axiom by axiom and check whether your interpretation satisfies each axiom. For example, in order for your interpretation to satisfy
- the first axiom, SubClassOf( Man Person ), the set that interprets Man has to be a subset of the set that interprets Person. Since this kind of sentence will soon become horribly contrived, we rather say ‘every instance of Man is also an instance of Person’.
- the second axiom, SubClassOf(Person (hasChild only Person)), every instance of Person is related, via the property hasChild, to instances of Person only. I.e., for an instance of Person, if it has an out-going hasChild edge, then this must link it to an instance of Person.
- the third axiom, ClassAssertion(Bob Man), the element that interprets Bob must be an instance of Man (see, now it becomes quite easy?).
- the fourth axiom, PropertyAssertion(hasChild Bob Alice), the element that interprets Bob must be related, via the hasChild property, to the element that interprets Alice.
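The axiom-by-axiom check described above can be sketched in a few lines of Python. This is only an illustration: the element labels e1, e2, e3 and the function name check_axioms are made up for this sketch, and a real reasoner works very differently.

```python
# A hand-made interpretation: a non-empty set of elements, plus a choice of
# subset for each class name, relation for each property name, and element
# for each individual name. The labels e1..e3 are arbitrary (assumption).
domain = {"e1", "e2", "e3"}
classes = {"Man": {"e1"}, "Person": {"e1", "e2"}}
properties = {"hasChild": {("e1", "e2")}}      # edges between elements
individuals = {"Bob": "e1", "Alice": "e2"}


def check_axioms(classes, properties, individuals):
    """Return, for each of the four axioms, whether the interpretation satisfies it."""
    man, person = classes["Man"], classes["Person"]
    has_child = properties["hasChild"]
    bob, alice = individuals["Bob"], individuals["Alice"]
    return {
        # Man must be interpreted as a subset of Person.
        "SubClassOf(Man Person)": man <= person,
        # Every hasChild edge leaving an instance of Person must end in a Person.
        "SubClassOf(Person (hasChild only Person))": all(
            y in person for (x, y) in has_child if x in person
        ),
        # The element interpreting Bob must be in the set interpreting Man.
        "ClassAssertion(Bob Man)": bob in man,
        # There must be a hasChild edge from Bob's element to Alice's element.
        "PropertyAssertion(hasChild Bob Alice)": (bob, alice) in has_child,
    }


report = check_axioms(classes, properties, individuals)
is_model = all(report.values())    # all four axioms hold: this is a model

# Change one choice: if e2 is no longer a Person, the second axiom fails,
# so the modified interpretation is not a model of the ontology.
broken = check_axioms({"Man": {"e1"}, "Person": {"e1"}}, properties, individuals)
```

Note that e3 plays no role in any axiom; an interpretation is free to contain such extra elements.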
So, in this case, we could, in principle, construct or invent interpretations and test whether they satisfy our ontology, i.e., whether they are models of it or not. This would, however, hardly enable us to say something about what holds in all models of our ontology because, as mentioned earlier, there can be loads of those, even infinitely many…so we rather leave this to tools called reasoners (and they do this in a more clever way). This whole exercise should, however, help us understand the above-mentioned entailment. Consider the following two axioms:
EquivalentClasses(Parent (Person and (isParentOf some Person)))
EquivalentClasses(GrandParent (Person and (isParentOf some (Person and (isParentOf some Person)))))
The first axiom says that the instances of Parent are exactly those instances of Person who are related, via isParentOf, to some instance of Person. The second axiom says that the instances of GrandParent are exactly those instances of Person who are related, via isParentOf, to some instance of Person who is, in turn, related, via isParentOf, to an instance of Person (the requirement that this intermediate element is a Person matters: without it, the entailment below would not follow). Please note that the GrandParent axiom does not mention Parent. Now you can try to construct an interpretation that satisfies both axioms and where you have an instance of GrandParent that is not a Parent…and it will be impossible…then you can think some more and come to the conclusion that these two axioms entail that every GrandParent is a Parent, i.e., that GrandParent is a subclass of Parent!
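To get a feeling for this, one can let a program try out all small interpretations. The following is a minimal sketch, assuming the GrandParent definition is read so that the intermediate child must itself be a Person. The function names and the size bound of 3 are choices made for this sketch. Since the two axioms define Parent and GrandParent completely in terms of Person and isParentOf, it is enough to enumerate all choices for those two. Note that a bounded search is only evidence for an entailment, not a proof, because models can be arbitrarily large; a failed check, on the other hand, is a genuine counterexample.

```python
from itertools import combinations, product


def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def some(relation, filler, domain):
    """Elements with at least one edge of the relation into filler ('some')."""
    return {x for x in domain if any((x, y) in relation for y in filler)}


def interpretations(max_size):
    """Yield (domain, Person, isParentOf) for all non-empty domains up to max_size."""
    for n in range(1, max_size + 1):
        domain = set(range(n))
        pairs = list(product(range(n), repeat=2))
        for person in subsets(domain):
            for is_parent_of in subsets(pairs):
                yield domain, set(person), is_parent_of


def defined_classes(domain, person, is_parent_of):
    """Compute the sets that the two equivalence axioms force on Parent and GrandParent."""
    # Parent == Person and (isParentOf some Person)
    parent = person & some(is_parent_of, person, domain)
    # GrandParent == Person and (isParentOf some (Person and (isParentOf some Person)))
    inner = person & some(is_parent_of, person, domain)
    grandparent = person & some(is_parent_of, inner, domain)
    return parent, grandparent


def holds_in_all(check, max_size=3):
    """True if check(parent, grandparent) holds in every enumerated interpretation."""
    return all(check(*defined_classes(d, p, r))
               for d, p, r in interpretations(max_size))


grandparent_is_parent = holds_in_all(lambda parent, gp: gp <= parent)
parent_is_grandparent = holds_in_all(lambda parent, gp: parent <= gp)
```

In this sketch, grandparent_is_parent comes out true for all interpretations with up to three elements, whereas parent_is_grandparent does not: an interpretation with two Persons and a single isParentOf edge between them has a Parent who is not a GrandParent.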
Coming back to Protégé: if you look at the inferred class hierarchy in Protégé, then you see both the ‘told’ and these entailed subclass relationships. In OWL, we also have two special classes, Thing and Nothing, and they are interesting for the following reasons:
- if Thing is a subclass of a user-defined class, say X, then every element in every interpretation is always an instance of X. This is often regarded as problematic, e.g., for reuse reasons.
- if your class, say Y, is a subclass of Nothing, then Y can never have any instance at all because Nothing is, according to the OWL specification, always interpreted as the empty set. In many cases, this indicates a modelling error and requires some repair.
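As a small illustration of a class that can never have an instance, consider three hypothetical axioms that are not part of the example ontology in this article: SubClassOf(Y Man), SubClassOf(Man Person), and SubClassOf(Y (not Person)). The sketch below, with made-up function names and an arbitrary size bound, enumerates all small models of these axioms and looks at what Y can be. Here the bounded search agrees with a simple argument: anything in Y would have to be both a Person and not a Person.

```python
from itertools import combinations, product


def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def models(max_size=3):
    """Yield every model of the three hypothetical axioms with up to max_size elements."""
    for n in range(1, max_size + 1):
        domain = frozenset(range(n))
        for y, man, person in product(subsets(domain), repeat=3):
            if (y <= man                        # SubClassOf(Y Man)
                    and man <= person           # SubClassOf(Man Person)
                    and y <= domain - person):  # SubClassOf(Y (not Person))
                yield {"Y": y, "Man": man, "Person": person}


def has_instance_in_some_model(class_name, max_size=3):
    """True if at least one enumerated model gives the class an instance."""
    return any(m[class_name] for m in models(max_size))


y_can_have_instances = has_instance_in_some_model("Y")
man_can_have_instances = has_instance_in_some_model("Man")
```

The ontology as a whole still has models (so it is not contradictory), and Man can have instances, but Y is empty in every one of them, which is exactly the situation a reasoner reports as Y being a subclass of Nothing.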
Finally, we can also ask our reasoner to answer a query, e.g., to give us all instances of Person. If you look again at the four axioms above, then we only have that Bob is an instance of Man, so we might be tempted not to return Bob to this query. On the other hand, we also have the axiom that says that every instance of Man is also an instance of Person, so we should return Bob because our ontology entails that Bob is a Person. Reasoners can be used to answer such queries, and they are not restricted to class names: for example, we could also query for all instances of (Person and (hasChild some Man)). Now, from the four axioms we have, we can’t infer that Bob should be returned to this query: we know that Bob is a Person and is hasChild related to Alice, and the second axiom even tells us that Alice is a Person, but nothing tells us whether she is a Man or not. Hence Bob can’t be returned to this query. Similarly, if we query for all instances of (Person and (hasChild max 1)), we cannot expect Bob to be in the answer: although we know that Bob is a Person and is hasChild related to Alice, we don’t know whether he possibly has other children, unbeknownst to us. This kind of behaviour is referred to as OWL’s open world assumption.
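The idea that an individual is only returned if it is an answer in every model can be sketched by brute force. The code below enumerates all models of the four axioms with at most two elements and asks, for three queries, whether Bob is an answer in all of them: being a Person, being a Person with some child who is a Man, and being a Person with at most one child. The function names and the size bound are assumptions of this sketch. A result of false is backed by a concrete countermodel, whereas a result of true over small models is only evidence, not a proof.

```python
from itertools import combinations, product


def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def models(max_size=2):
    """Yield every model of the four axioms with at most max_size elements."""
    for n in range(1, max_size + 1):
        domain = range(n)
        pairs = list(product(domain, repeat=2))
        for man, person, has_child in product(
                subsets(domain), subsets(domain), subsets(pairs)):
            if not man <= person:                  # SubClassOf(Man Person)
                continue
            if any(x in person and y not in person for x, y in has_child):
                continue                           # SubClassOf(Person (hasChild only Person))
            # Bob and Alice may be interpreted as the same element:
            # OWL does not assume that different names denote different things.
            for bob, alice in product(domain, repeat=2):
                if bob in man and (bob, alice) in has_child:   # the two assertions
                    yield {"Man": man, "Person": person,
                           "hasChild": has_child, "Bob": bob, "Alice": alice}


def children(model, x):
    return {y for (s, y) in model["hasChild"] if s == x}


def is_person(model, x):
    return x in model["Person"]


def person_with_some_man_child(model, x):
    return x in model["Person"] and bool(children(model, x) & model["Man"])


def person_with_at_most_one_child(model, x):
    return x in model["Person"] and len(children(model, x)) <= 1


def bob_is_answer_in_all_models(query, max_size=2):
    return all(query(m, m["Bob"]) for m in models(max_size))


bob_person = bob_is_answer_in_all_models(is_person)
bob_man_child = bob_is_answer_in_all_models(person_with_some_man_child)
bob_one_child = bob_is_answer_in_all_models(person_with_at_most_one_child)
```

Only the first query returns Bob. For the other two, there are models in which Alice is not a Man, and models in which Bob has a second child, so Bob is not an answer in all models.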
It is quite common to distinguish class-level ontologies (which only have axioms about classes, but don’t mention individuals) from instance-level ontologies (i.e., assertions about the types of and relations between individuals). We find ontologies that are purely class-level, such as SNOMED CT and NCIt, where reasoning is used purely to make sure that the things said about classes and the resulting entailed class hierarchy are correct, and that no contradictory things have been said that would lead to subclasses of Nothing or to the whole ontology being contradictory. One interesting option is then, e.g., to export the resulting class hierarchy as a SKOS vocabulary to be used for navigation. We also find ontologies with both class- and instance-level axioms, which are used with the above query answering mechanism as a flexible, powerful mechanism for accessing data.
Finally, if you want to use OWL for your application, you will first have to clarify whether this involves a purely class-level ontology, or whether you want to use OWL for accessing data. In the latter case, you have two options: you can leave the data in the database, files, or formats that it currently resides in, and use existing approaches (e.g., using QuOnto, OWLGres or REQUIEM) to map this data to your class-level ontology and thus query it through the OWL ontology. Or you can extract the data and load it into an instance-level ontology and go from there. Both clearly have advantages and disadvantages, whose discussion goes beyond the scope of this article (as does that of many other aspects).
So, where to go next if you want to learn more about OWL? First, you could download an OWL editor such as Protégé 4, and follow a tutorial on how to build an OWL ontology (see below for more links). You could also read the substantial OWL Primer (it has a cool feature which lets you decide which syntaxes to show and which to hide!) and take it from there. Or you could read some of the papers on experiences with OWL in modelling biology. Regardless of what you do, building your own OWL ontology and asking reasoners to make entailments salient always seems to be a good plan.
- OWL primer
- tutorials: a selection from Manchester
- a list of reasoners and other tools like the OWL editor Protégé 4 and the OWL API
PS: I need to point out that (i) OWL is heavily influenced by classical first-order predicate logic and by research in description logics (these are fragments of first-order logic that have been developed in knowledge representation and reasoning since the late 1980s), and that (ii) OWL is much more than what is mentioned here: e.g., we can annotate axioms and classes, import other ontologies, etc., and in addition to the OWL constructors such as ‘and’, ‘some’, ‘only’, used here, there are numerous others, far too many to be mentioned here.