James Malone – Ontogenesis

Review of Reference and Application Ontologies

Mikel Egana Aranguren — Fri, 22 Jan 2010 13:24:08 +0000

This is a review of the paper Reference and Application Ontologies.

This paper describes the difference between reference and application ontologies, especially w.r.t. the motivation for using application ontologies.

I would add an abstract that clearly states the difference between a reference ontology and an application ontology, and merge the introduction with the background: the current introduction does not mention reference ontologies and jumps directly to the background.

At the end of the background, it seems that the motivation for application ontologies is some deficiencies in reference ontologies, but this is not made clear.

It is not clear whether the division between reference and domain ontologies is the consequence of technical limitations or a necessary conceptual decision. That is, if we solved the technical problems with reference ontologies (e.g. interoperability, use of the same Upper Level Ontology), would we still need domain ontologies or not? I think the authors should elaborate on this point.

It would be helpful to have a more thorough example in paragraph 1 of section “Motivation …”, with concrete ontologies.

I would add CCO as an example of a domain ontology. Building CCO, like (I believe) other domain ontologies, required considerable technical effort, since domain ontologies collect and enrich information from other ontologies and resources. I would like to see a more in-depth discussion of the technical problems encountered when building domain ontologies (importing, IDs, resolving semantic mismatches, use of reasoners, etc.).

This paper should be accepted.

Community Driven Ontology Development

Frank Gibson — Fri, 22 Jan 2010 10:49:07 +0000

Frank Gibson and James Malone^§

^§European Bioinformatics Institute, Cambridge, CB5 8LW, UK

Community driven ontology development is the process of collaboratively building an ontology which represents the understanding of a particular community or domain area. Within the biological domain, collaboration and community involvement are commonplace. As an ontology can be interpreted as a “shared understanding” of a particular domain, collaboration and community involvement should be maximised throughout the life-cycle of an ontology.

Background
Integrating biological knowledge within an ontological framework produces what is referred to as a bio-ontology, where a shared understanding of biology is represented in a computationally amenable form. The study of the biological sciences is a global effort of researchers and institutions, each specialising in or across particular niches to further our understanding of biology. As the experts in a particular biological field are rarely physically co-located, bio-ontology development is highly distributed, forming what can be thought of as virtual organisations in which experts with different but complementary skills collaborate in building an ontology for a specific purpose.

Typically, bio-ontology development is dynamic: domain experts join and leave the network at any time and decide on the scope of their contribution to the joint effort. In addition, biological ontologies continue to evolve, even after the initial development drive. This continued evolution reflects the advancement of scientific knowledge. New classes, properties, and instances may be added at any time, and new uses or an extended scope for the ontology may be identified.

The diversity of the life-science domain results in a multitude of application domains for ontology development and therefore produces numerous ontologies for biology. However, this diversity is matched by homology. Typically, the same experimental equipment and reagents can be used to study different aspects of biology. For example, a mass spectrometer can be used to determine the elemental composition of a molecule in a chemistry-based experiment, and to determine the chemical structure of peptides in a proteomics investigation. Because equipment, reagents and organisms are applied in many ways across the study of biology, they have the potential to be represented multiple times across different bio-ontologies, with slightly different definitions. This potential proliferation or duplication of terms could undermine the ethos of ontology development: to produce a shared understanding.

Examples

OBI
The Ontology for Biomedical Investigations (OBI) aims to produce an ontology which represents the common components of life-science experimentation, such as equipment, materials and protocols. The developer community of OBI is currently affiliated with 18 diverse biomedical communities, ranging from functional genomics to crop science to neuroscience. In addition to having a diverse community of expertise, the OBI developers work in a virtual organisation encompassing multiple countries and time zones.

The OBO Foundry
In an attempt to address the issue of bio-ontology proliferation and potential overlap, the Open Biomedical Ontologies (OBO) Foundry was formed (Smith et al, 2007). The OBO Foundry describes itself as “a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain.” The role of the OBO Foundry is twofold. In one role, the OBO Foundry acts as a registry to collect public domain ontologies that, by design and revision, are developed by and available to the biomedical community, fostering information sharing and data interpretation. This section is called the OBO library. (As of 23rd of January 2010 there are 111 candidate ontologies at the OBO Foundry, representing knowledge domains ranging from amphibian gross anatomy and infectious diseases to scientific experimentation.)

In addition to providing a library of bio-ontologies, the OBO Foundry was formed to reduce ontology overlap and ensure bio-ontology orthogonality. Initial steps towards this aim have produced a set of design criteria (http://www.obofoundry.org/crit.shtml) to which domain ontologies should adhere, such as openness, a shared syntax and class definitions. The OBO Foundry proposes a review process by which ontologies may become ‘certified’ as meeting the OBO Foundry criteria (e.g. orthogonality, shared namespace). At present no OBO Foundry ontology has been awarded this certified status, although several have been proposed as “Candidate ontologies” which are ready to be reviewed. Criticisms of the OBO Foundry centre on the lack of a suggested community-oriented engineering methodology, method or technique by which these principles can be met (see Hull and Gibson).

Authors: Frank Gibson and James Malone

Acknowledgements

This paper is an open access work distributed under the terms of the Creative Commons Attribution License 2.5 (http://creativecommons.org/licenses/by/2.5/), which permits unrestricted use, distribution, and reproduction in any medium, provided that the original author and source are attributed.

The paper and its publication environment form part of the work of the Ontogenesis Network, supported by EPSRC grant EP/E021352/1.

Reference and Application Ontologies

James Malone — Fri, 22 Jan 2010 10:19:30 +0000

James Malone and Helen Parkinson

European Bioinformatics Institute, Cambridge, CB10 1SD, UK

Introduction

An application ontology is an ontology engineered for a specific use or application focus and whose scope is specified through testable use cases. The application ontology will often use or reference canonical ontologies to construct ontological classes and relationships between classes. Application ontologies are used when modeling cross-domain experiments in biology, for data annotation or visualization and for producing data driven views across reference ontologies for specific user groups.

Author Profiles

Helen Parkinson is a geneticist who was seduced to the dark side (Bioinformatics) 10 years ago. She manages and annotates high throughput functional genomics data for the ArrayExpress database and Atlas of Gene Expression hosted at The European Bioinformatics Institute. She also builds ontologies such as EFO and OBI to annotate these data.

James Malone is a knowledge engineer and computer scientist who builds ontologies and triple stores at the EBI. He is a Newcastle United supporter and therefore often disappointed.

Background

There are many reference or ‘canonical’ ontologies in biomedicine. Organizations such as the OBO Foundry aim to organise these reference ontologies into a collection of non-overlapping or ‘orthogonal’ and interoperable resources. There are challenges in integrating, building and consuming reference ontologies. Current reference ontologies are not fully interoperable as they are constructed in different styles, using different tools, and often do not share a common upper level ontology.
Consequently, importing all or part of most reference ontologies into a single resource is not practical or feasible. Furthermore, importing and combining large ontologies like FMA produces very large ontologies which cause scaling problems when performing reasoning using description logics. There is also an issue of coverage: reference ontologies do not necessarily contain sufficient combinations of classes (e.g. intersections or unions) to represent experimental data. For example, information about a cell line includes the cell type and tissue from which it derives, and information about the individual from which the tissue was obtained.

Motivation for developing Application Ontologies

Application ontologies are typically used when crossing domains, e.g. transcriptomics and genomics, or when combining annotation on the sample, gene and experiment dimensions. Let’s consider a gene expression use case: we’d like to make statements about experimental processes, assays, cell types, cell lines, diseases and chemical compounds used to treat cell lines which are experimental models for disease. Performing queries using all these concepts requires that the reference ontologies are fully integrated. An application ontology resolves these issues by importing all or parts of the reference ontologies that are required to support the application use cases and by integrating them along a common axis. The common axis may be an upper level ontology or a structure that best represents the needs of the application, e.g. one driven by the data.
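The import-and-integrate step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "reference ontologies" are toy parent maps, every class name is made up for the example, and the axis root "experimental factor" is a hypothetical choice rather than the structure of any real application ontology.

```python
# Toy reference ontologies: each maps a class to its parent (None = root).
# All class names are illustrative, not taken from real ontologies.
CELL_TYPES = {
    "cell": None,
    "epithelial cell": "cell",
    "neuron": "cell",
}
DISEASES = {
    "disease": None,
    "cancer": "disease",
    "carcinoma": "cancer",
    "diabetes": "disease",
}


def extract_with_ancestors(reference, wanted):
    """Return the wanted classes plus all of their ancestors."""
    module = {}
    for term in wanted:
        # Walk up the hierarchy until the root or an already-copied class.
        while term is not None and term not in module:
            module[term] = reference[term]
            term = reference[term]
    return module


def build_application_ontology(axis_root, modules):
    """Merge extracted modules, hanging each module root under a common axis."""
    merged = {axis_root: None}
    for module in modules:
        for term, parent in module.items():
            merged[term] = parent if parent is not None else axis_root
    return merged


# Import only what the (hypothetical) use case needs from each reference.
app = build_application_ontology(
    "experimental factor",  # hypothetical common axis
    [
        extract_with_ancestors(CELL_TYPES, ["epithelial cell"]),
        extract_with_ancestors(DISEASES, ["carcinoma"]),
    ],
)
```

Classes the use case does not need (here "neuron" and "diabetes") are never copied, which keeps the resulting ontology small; real ontologies would of course need proper module extraction rather than a simple walk up a parent map.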

Application ontologies can also offer alternative ‘views’ on the reference ontologies by producing specific user or domain-oriented definitions for ontology classes. This may involve producing a definition that a particular community will relate to, given the application focus (for example, ‘normalization’ may have several meanings depending upon the context), or rendering class labels for a specific user community.

An application ontology should be evaluated against a set of use cases and competency questions which represent the scope and requirements of the particular application. For example, a user query use case may contain the competency question ‘what cancer cell line data is there’. This requires sufficient ontological coverage to capture the concept of ‘cancer cell line’.
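As a rough illustration of evaluating against a competency question, the sketch below checks that a toy ontology contains the class ‘cancer cell line’ and then uses the class hierarchy to answer the question. The hierarchy, the data set identifiers and the function names are all hypothetical and exist only for this example.

```python
# Hypothetical class hierarchy: child -> parent (None = root).
SUBCLASS_OF = {
    "cell line": None,
    "cancer cell line": "cell line",
    "example carcinoma cell line": "cancer cell line",
    "example fibroblast cell line": "cell line",
}

# Hypothetical annotated data: data set identifier -> annotating class.
ANNOTATIONS = {
    "dataset-1": "example carcinoma cell line",
    "dataset-2": "example fibroblast cell line",
    "dataset-3": "cancer cell line",
}


def is_a(term, ancestor, hierarchy):
    """True if term equals ancestor or is a (transitive) subclass of it."""
    while term is not None:
        if term == ancestor:
            return True
        term = hierarchy.get(term)
    return False


def answer_competency_question(target, hierarchy, annotations):
    """Return data sets annotated with the target class or any subclass.

    Raises KeyError if the ontology does not cover the target class,
    i.e. the competency question cannot be answered at all.
    """
    if target not in hierarchy:
        raise KeyError(f"ontology lacks the class {target!r}")
    return sorted(
        dataset
        for dataset, term in annotations.items()
        if is_a(term, target, hierarchy)
    )


cancer_data = answer_competency_question(
    "cancer cell line", SUBCLASS_OF, ANNOTATIONS
)
```

A question whose target class is missing fails loudly, which is the useful signal here: it marks a gap in coverage that the application ontology needs to fill.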

Examples of Application Ontologies

EFO

The EBI’s Experimental Factor Ontology is used to represent sample variables from gene expression experimental data. EFO imports classes from multiple reference ontologies and produces new classes which add additional knowledge to reference ontology classes in order to meet querying and curation use cases.

NIFSTD

The Neuroscience Information Framework (NIF), formerly known as BIRN, has produced the NIFSTD ontology. NIF is ‘A dynamic inventory of Web-based neuroscience resources: data, materials, and tools accessible via any computer connected to the Internet’. NIF has two application resources:

1. NIFSTD, an ontology with separate modules covering major domains of neuroscience: anatomy, cell, subcellular, molecule, function and dysfunction.

2. NeuroLex, which has detailed concepts for describing experimental techniques and instruments typically employed to carry out neuroscientific studies, as well as concepts for describing digital resources being created throughout the neuroscience community.

Both NIFSTD and NeuroLex are non-orthogonal to OBO Foundry ontologies and contain cross-references to, for example, FMA terms, adding local terms when needed.

Conclusion

Application ontologies are used to meet specific use cases and consume reference ontologies. They have some drawbacks which must be managed if they are to be used successfully.

1. Scaling can be an issue: terms need to be imported and ontologies can become large quickly.

2. Ontologies change rapidly, therefore importing classes without checking whether these are still current can mean inbuilt obsolescence. Agent technology can be used to manage this.
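The currency check in the second point can be illustrated with a simple audit that compares imported classes against a current snapshot of the reference ontology. This is a plain comparison, not the agent technology mentioned above, and the identifiers, labels and snapshot format are hypothetical.

```python
# Classes imported into the application ontology at some earlier date:
# identifier -> label at import time. All identifiers are hypothetical.
IMPORTED = {
    "REF:0001": "cell",
    "REF:0002": "carcinoma",
    "REF:0003": "old tissue term",
    "REF:0004": "retired term",
}

# Assumed current snapshot of the reference ontology.
CURRENT_REFERENCE = {
    "REF:0001": {"label": "cell", "obsolete": False},
    "REF:0002": {"label": "carcinoma (disease)", "obsolete": False},
    "REF:0003": {"label": "old tissue term", "obsolete": True},
}


def audit_imports(imported, reference):
    """Group imported identifiers by what has changed in the reference."""
    report = {"obsolete": [], "missing": [], "relabelled": []}
    for term_id, label in sorted(imported.items()):
        entry = reference.get(term_id)
        if entry is None:
            report["missing"].append(term_id)      # no longer in the reference
        elif entry["obsolete"]:
            report["obsolete"].append(term_id)     # flagged obsolete upstream
        elif entry["label"] != label:
            report["relabelled"].append(term_id)   # label has drifted
    return report


report = audit_imports(IMPORTED, CURRENT_REFERENCE)
```

Running such an audit on each release of the reference ontologies would flag imported classes that need review before they become silently out of date.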

Acknowledgements

This paper is an open access work distributed under the terms of the Creative Commons Attribution License 2.5 (http://creativecommons.org/licenses/by/2.5/), which permits unrestricted use, distribution, and reproduction in any medium, provided that the original author and source are attributed.

The paper and its publication environment form part of the work of the Ontogenesis Network, supported by EPSRC grant EP/E021352/1.

Ontogenesis: Who’s here?

Duncan Hull — Fri, 22 Jan 2010 09:04:10 +0000

Who’s here? The following is an alphabetical list of people currently attending the Ontogenesis Blogging a Book Experiment.

Sean Bechhofer, University of Manchester
Michel Dumontier, Carleton University
Mikel Egana-Aranguren
Frank Gibson
Matthew Horridge, University of Manchester
Duncan Hull, EBI
Simon Jupp, University of Manchester
Allyson Lister, Newcastle University
Phillip Lord, Newcastle University
James Malone, EBI
David Osumi-Sutherland, University of Cambridge
Helen Parkinson, EBI
Robert Stevens, University of Manchester
Christopher Brewster, Aston Business School
Alan Rector, University of Manchester
Ulrike Sattler, University of Manchester
David Shotton, University of Oxford