Meta – Ontogenesis

Semantic Free Identifiers

Phillip Lord — Fri, 01 Jul 2011 13:00:44 +0000

As an ontologist, I probably should have known better and got this right in the first place. I have changed Ontogenesis to use semantic-free identifiers. From initial testing, it appears that WordPress is doing-the-right-thing. The old permalinks redirect to the right place, and internal links have been automatically updated. So, it should work from inside and outside, but please let me know if it is broken anywhere.

The motivation for this came from two sources. First, someone suggested that I change the name of my article; it was a good suggestion, but I started to worry that it would make my permalink out-of-date. Second, Duncan Hull decided to shorten the permalink for his article. Perfectly sensible but, again, this caused some consternation as it broke the implied semantics.

Bottom line, here, is that semantic-free identifiers seemed like a sensible way to go. The flip side is, of course, that they are a little harder to remember, and there is more chance of error when inserting them in text, because you can’t tell whether it’s the right link or not. However, this is also true if articles change during review so that the implicit semantics in the link are wrong.
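The trade-off can be made concrete with a small sketch. The Python below is illustrative only: the `slugify` helper, the numeric post ID and the example titles are all hypothetical, and this is not how WordPress or Ontogenesis actually generates permalinks. It simply shows that a title-derived link changes when the title changes, while a semantic-free one does not.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical helper: derive a 'semantic' permalink slug from a title."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def semantic_link(title: str) -> str:
    """Permalink that carries meaning, so it depends on the current title."""
    return "/" + slugify(title) + "/"


def semantic_free_link(post_id: int) -> str:
    """Permalink built only from an opaque identifier (assumed numeric here)."""
    return "/" + str(post_id) + "/"


post_id = 42                              # hypothetical identifier
old_title = "My First Article"            # hypothetical title before review
new_title = "My First Article, Revised"   # hypothetical title after review

print(semantic_link(old_title))      # link derived from the old title
print(semantic_link(new_title))      # a different link once the title changes
print(semantic_free_link(post_id))   # unchanged whatever the title is
```

The cost is visible too: nothing in the semantic-free link tells a human reader which article it points to.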

There isn’t really a way to square this circle. Comments welcome.

How to Review for Ontogenesis

Phillip Lord — Mon, 30 May 2011 16:33:10 +0000

The Ontogenesis kblog is an open environment; this includes the reviews too. So, the author will know who reviewed the article; indeed, it is most likely that the authors selected the reviewers. It is our hope and experience that the open nature of the review process contributes positively to the process; reviewers gain credit for their contribution while, at the same time, the openness ensures that the process is self-policing.

We ask that reviewers apply the same reviewing standards that we would all like to have for our own articles; they should provide significant, positive and constructive feedback about the article. Articles for ontogenesis kblogs are aimed at an audience that is attempting to learn about ontology modelling, authoring and using ontologies. They are supposed to be short, accessible articles; while they should be correct, they do not need to be fully detailed. If more needs to be written, this may indicate another article is needed — we welcome contributions the ideas for which have come from the reviewing process.

Reviewing an article follows a similar process to writing an article in the first place, and should present no technical challenges for those who have already written a kblog article. A more complete description of the technical process is available on Process.

Authors

Phillip Lord
School of Computing Science
Newcastle University
United Kingdom
NE3 4PH

phillip.lord@newcastle.ac.uk

Robert Stevens
School of Computer Science
University of Manchester
United Kingdom

robert.stevens@manchester.ac.uk

Ontogenesis: One Year On

Phillip Lord — Tue, 05 Apr 2011 17:23:06 +0000

Abstract

The Ontogenesis kblog has now been available for a little over a year. This offers us a good time to sit back and look upon what has been achieved, and how we can improve things for the future.

For the reader

The original motivation for Ontogenesis came from our long-held desire for a book on ontology development, aimed at a level appropriate for incomers into the field. As a discipline, we have made some great strides in the uptake of ontological technology, but to those coming afresh to the field it is often difficult to know where to start.

However, book publishing tends to be a long-winded and difficult process; we wanted something lighter, faster, responsive and easier. Our experiences with blogging technology suggested this might be a good way forward. At the same time, we wanted Ontogenesis to have an appropriate degree of academic credibility; we didn’t want a collection of opinion pieces or unformed ramblings that define much of the blogosphere. We wanted to see if we could use a blog engine to replace a process akin to the existing publishing process.

The Ontogenesis kblog now offers a collection of small accessible articles on aspects of knowledge and semantics, primarily ontologies, within biology/bioinformatics. At the end of the first year we have over twenty articles available in the Ontogenesis k-blog. There is a mean “reads per month” of 1,000, which makes for a total number of accesses of over 14k, with a peak of 1,250 reads for Jan 2011. The most accessed article, perhaps unsurprisingly, is What is an Ontology.

Even given that this is after robot removal, this is clearly not a statistic to set the publishing world alight; however, it is of a similar size to the number of page reads that might be expected for an “average” PLoS ONE article, and it is also reasonable when compared to the print run for an average academic book. In short, we feel that these statistics are rather good.

For the reviewers

Whilst content supply has been fairly good, the gathering of reviews by authors has not been so good. In the Ontogenesis k-blog, authors are supposed to manage the review process themselves. Most authors, including ourselves, just stop after publishing an article. This may be for several reasons:

The first and, we think, the most important reason relates to an issue identified early by Sean Bechhofer: are we a blog or a wiki? In this web-based environment, authors are more used to a collaborative style of writing. For instance, when I (PL) started an article which involved a description of the semantics of OWL, I asked Uli Sattler to be a co-author, as I am not an expert, partly to check my initial statements. This has improved the quality of the article but, of course, she was now no longer suitable as a reviewer.

A second, more social reason, is that authors like to write and publish; once that is achieved the drive to gain reviews is lost. If publication on the kblog didn’t happen until there were reviews, the incentive would change. This would, however, interfere with the lightness of the process. Of course, authors want comment on their articles and kblogs, like any blog, have such a mechanism via the standard commenting mechanisms. We will also have to look at incentives for getting authors to gather reviews; even simply emailing people to “come, look and comment”.

A third, and critical reason, is that the system we had for peer-review was hard to use. We replicated a complete “two reviewers and editor” style system. Authors, reviewers and editors had to remember where they were. This was just too difficult.

For Ontogenesis, we have addressed this in two ways. First, we have moved to a single, author-directed reviewing system. Authors choose their own reviewer, and decide when they have addressed the comments; as all the information is public, we consider that this should be sufficient. Secondly, we have provided some software support within WordPress to help manage the process.

There are many forms of reviewing besides the standard peer-review; Wikipedia has one system, while H2G2 has another. Perhaps, in the end, we need to think less of authors and reviewers, and more of primary and secondary authoring. It is not clear to us what the future holds for the peer-review system in science, but we hope that the kblog as a platform is a suitable place to experiment.

For the authors

Our original intention was to provide a resource that was equivalent to a book. However, based around blogging software as it is, Ontogenesis allows us to take advantage of a rapid publication framework; this, in turn, has changed the form of articles that we are writing. While we still have “long-form” articles on topics such as health informatics, as authors we have started to write shorter articles on small, discrete topics such as closure axioms. This form of article is, we hope, useful; that James Malone and Helen Parkinson’s article distinguishing reference and application ontologies has had 500+ page reads suggests that it is. These articles are also enjoyable to write.

We have also seen the advantages over a wiki-style environment. Authors can provide their own ideas within an article, as well as adopt their own style. Although it is a small advantage, the fact that an article can contain mild humour produces, in the end, a more readable resource. Our experience with Wikipedia suggests that this form of personal approach is rapidly edited away. Of course, the brutal reality is that the main advantage is that the work remains attached to the individual authors; whether or not we like it, academics need to self-promote and cannot afford to do work without credit.

For the future

At the start of the Ontogenesis k-blog, we used a vanilla WordPress installation. This worked, but needed some extras to make it suitable for scientific publishing. We have been fortunate enough to receive funding from JISC for kblog. Work on this grant is now well underway; we now have improved how-to documentation, we have a more refined process, and several pieces of software for presentation of maths, for citations and to enable searching, sharing and storing. Ontogenesis is now archived by the British Library, articles have DOIs, and we have been included in Google Scholar. We are hoping for inclusion in PubMed. In short, we are filling the gaps between Ontogenesis and a formal academic book rapidly, without compromising on the original vision of a rapid and easy-to-use publication framework.

We feel that this form of publication has a strong future ahead of it. The experience of getting this content onto the web has been enjoyable, and informative. We hope that the resource is valuable to others and that authors continue to find a desire to contribute.

Considering the Process

Phillip Lord — Mon, 01 Mar 2010 16:10:06 +0000

Introduction

In this post, I want to address some of the issues raised by Sean Bechhofer in his thoughtful reflection. One of my aims with thinking about the knowledgeblog process was to come up with a form of publishing which is focused on the reader, reviewer and author, rather than the publisher. Clearly, Sean found it lacking in this respect and the process needs improving as a result.

Technology

I wanted to address what I think is the most answerable comment first.

As an aside, I found the WordPress UI a nightmare to work with. I don’t really like browser based editors, so prefer to author text using a text editor, and then cut’n’paste into the web tool. Once I’d pasted text into the text box, however, I found it messed with my underlying HTML markup (adding lots of <br> elements and stripping all my <p>’s out). Tables seemed problematic too, although that may be something to do with the underlying style.

The short answer here is, yes, I agree. The management side of the WordPress UI is, I think, generally okay and reasonably easy to use. But the post editor is designed for people who want to write short, unstructured notes. I’ll talk more about wikis later on, but I think that they have a similar problem.

I was quite keen with knowledgeblogging to separate out the process of editing from the process of publishing. Science is, I think, just too complex for a one-size-fits-all approach. People want to achieve different things and no tool stands out as being ideal. For collaborative editing, for example, Google Docs works really well. For structured authoring, LaTeX is excellent; with added maths, it is pretty much essential. Or consider my own technique, which uses asciidoc. Some of my posts involve incorporating LaTeX, Manchester syntax OWL and Python; by using asciidoc, I can get it all syntax-highlighted, and can even incorporate directly from source which I can run. None of these solutions is ideal; authors need to be able to pick the one that benefits them the most.

HTML is a simple enough common denominator which many tools can generate, and WordPress can (mostly) display. Of course, by allowing best-of-breed software, the knowledgeblog has problems; it’s never going to be as easy as an integrated solution. We need better documentation, enabling the use of these technologies.

I think that the solution here is to have a place to lodge articles describing different mechanisms for posting on knowledgeblog.org. My solution here is to use the knowledgeblog process; I plan to start “process.knowledgeblog.org”, describing how to use various aspects of the process, including different technologies and tools for knowledgeblogging.

Style of Content

Sean asks at several points about the style of articles that we were generating and whether the process was ideally suited to it.

A. Writing a number of short “encyclopedia style” articles relating to
   ontologies (and their use in bioinformatics).

B. Investigating new models for the publication process, in particular the use
   of a blog in order to manage the review process.

The rationale for A is clear, for B, the intention is to try and reduce some
of the overhead and time delay that can be present when using traditional
publishing routes. However, in my final analysis I think the difference
between the kinds of short article for an encyclopedia and longer scientific
papers means that the process hampered us somewhat in the production of our
initial articles

And later:

The reality is that what we were trying to do falls somewhere
between what’s offered by a blog and a wiki. A wiki may well have been a
better environment for supporting the collaborative writing and commenting
process for the encyclopedia.

Again, I think I largely agree with this. The process was designed to mimic the existing process of author/review/accept with some additional value coming from the blog software; referenced articles automatically get backlinked, for example. The process normally happens asynchronously; different people do things at different times. The two key changes from the existing process are, I think, both good: first, review is public with the reviews forming part of the scientific record; second, there are no artificial deadlines coming from the publication process, so new articles can be published as they are ready.

To kick-start this, however, we needed content. Ideally, authors and reviewers would have generated content without a meeting; however, in practice, I thought this was less likely to happen; the number of articles which have been completed since the meeting seems to bear this out. The co-located meeting, however, did not ideally suit the process, as Sean suggests:

The fact that we were all co-located also meant that I wanted/expected quick
feedback: “shouts across the room”.

I do not offer any solution to square this circle; the meeting was successful in generating content, but, by its nature, the process was not entirely suited to the meeting.

Wiki or Blog

Why aren't we doing this through a wiki?

Again, a good question. In the environment of a meeting in a room, a wiki might well have been a better alternative for hosting Ontogenesis. However, in the end, I don’t think that this is the right approach.

Firstly, the big issue is one of credit. It’s an issue of critical importance; scientists need credit for their work; we have to see our name attached to our words or, in time, we will be out of business. Now I may have opinions on whether this is a good thing or a bad thing, but it’s not really relevant; it is part of the way that the world is and we need the process to fit to this reality, not the other way around. A blog provides this; even with a review process, it is the author’s post and they retain the credit. Or the blame; which leads to the second issue.

Wikipedia is a good example of what you can do with a wiki; like most academics (and everyone else!), I find it to be a tremendous resource and use it regularly. One thing, however, that it is fairly poor at is reflecting differing opinions; as a minor example, take this post on LSIDs. This article spends more time describing what is wrong with LSIDs than describing what they are. In many ways two articles would be better, reflecting the opinions of those involved; consider the more extreme example of the global warming articles in Wikipedia and Conservapedia. Wikis are designed for multi-author, collaborative generation of content; blogs are designed for single (or a few) author content, with interactions between the different authors. They seem a better fit.

Of course, time will tell. For instance, Ontology Design Patterns appears to be implementing a peer-review and evaluation process using a wiki; they’ve used a similar approach, except with semantics in URLs rather than in categories differentiating between articles and reviews. But then, how much are they making use of the collaborative features of the wiki?

Versioning

The final issue to address is the thorny one of versioning, as Sean says:

It becomes a new article once the semantics are largely enough to make it new;
don't think that there is a hard or fast line here. But, ultimately, I think
we need to address this with open versioning.

and later:

Versioning. What is the versioning strategy that is used? Are articles
edited “in place”, or should edits result in a new article? In which case,
should all edits result in new articles, or can we fix typos? Who then
decides what edits are “acceptable”?

My answer here is that, once accepted, it should be possible to update an article only for technical reasons: English corrections, small errors or updating metadata (“this article has been outdated”). Now, when does an article become a new article? In the absolute sense, I think that this is a hard question, but my answer in this more specific case is: when it has changed enough. This sort of editorial policy is one that needs to develop over time, based on specific examples. I think, however, we do need better support to ensure integrity of the scientific record. For this purpose, I think we need to extend WordPress; the version history of each post is available in the database, and I think that it needs to be uncovered for public consumption, or at least all versions since the article was made public. Of course, the current articles were not written with this in mind, but I think that future articles should be.
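The idea of uncovering “all versions since the article was made public” can be sketched in a few lines of Python. This is only an illustration: the `Article` class and its method names are hypothetical, and the sketch does not touch the WordPress database or any WordPress API. It keeps an append-only list of revisions and exposes only those from the first public version onwards.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Article:
    """Hypothetical article with an append-only revision history."""
    revisions: List[str] = field(default_factory=list)
    public_from: Optional[int] = None  # index of the first public revision

    def save(self, text: str) -> None:
        """Record a new revision; earlier revisions are never overwritten."""
        self.revisions.append(text)

    def make_public(self) -> None:
        """Mark the latest saved revision as the first public one."""
        if not self.revisions:
            raise ValueError("nothing has been saved yet")
        if self.public_from is None:
            self.public_from = len(self.revisions) - 1

    def public_history(self) -> List[str]:
        """Every revision since the article was made public, oldest first."""
        if self.public_from is None:
            return []
        return self.revisions[self.public_from:]


article = Article()
article.save("private draft")          # never shown to readers
article.save("accepted text")
article.make_public()
article.save("accepted text, typo fixed")

print(article.public_history())
```

Private drafts stay hidden, while every change made after publication remains visible, which is what the integrity of the scientific record requires.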

Conclusions

Some strong criticisms were raised about the process of knowledgeblogging; I think that these are mostly valid, but I think they are addressable with some changes to the process, some extensions to WordPress and, above all, better communication with the authors about what is required.

Despite the short-comings of the process, in one short meeting, we did generate substantial content and this, in turn, has generated significant interest. In the last month (February) alone, Ontogenesis has had around 1000 page views, which is significant compared to the average scientific book. Of course, this success is mostly the success of the content and not the process; if the process is not tailored further to the needs of the authors, they will not continue to contribute.

The process has fulfilled one key need; it has provided publicity and an audience in a timely manner. For this reason, I think that our initial experiment with knowledgeblogging has been a success — limited, guarded and in need of improvement, but a success none the less. Hopefully, we can build on this for the future.

Back to Books: Researchers should be recognized for writing books to convey and develop science

Duncan Hull — Thu, 04 Feb 2010 10:13:32 +0000

There is an interesting editorial on books [1] today in Nature, related to Ontogenesis and books.

“Back to books: Researchers should be recognized for writing books to convey and develop science.”

References

[1] “Back to books.” Nature, Vol. 463, No. 7281 (3 February 2010), p. 588. DOI:10.1038/463588a

Registration Required

Phillip Lord — Mon, 01 Feb 2010 12:05:31 +0000

Registration is now required for commenting due to the inevitable spam which has followed the first meeting. Unfortunately, this is not a “personal” blog, so I can’t use Akismet without a license key. At the moment, the cost of this is prohibitive.

Unfortunate, but not surprising. Pingbacks still work, so commenting without registration can still happen this way.

Stats now available

Phillip Lord — Wed, 27 Jan 2010 15:48:54 +0000

I’ve installed the stats plugin now, so that we can trace the posts and pages that are being widely viewed. The graph was initially broken after I made the mistake of following the installation instructions; it should be working now.

The hit count is still at around 3000 hits per day, which works out at around 150 article views a day.

Ontogenesis hits the Blogosphere

Phillip Lord — Mon, 25 Jan 2010 19:13:15 +0000

I’m pleased to note that Ontogenesis and its articles have hit the blogosphere already. One post appeared the day after the meeting, or one day after the first peer-reviewed post went live. The second was from Doug Kell of BBSRC. A nice demonstration of the speed of this form of scientific publication.

Apologies to those who noticed the outage from the server this afternoon. Naturally, I chose the period of initial internet exposure to fiddle with the server and, so, took Apache down for the duration. Possibly not the greatest decision ever; sometimes I amaze even myself. There will be a second outage later in the week for further work.

Reflections on Blogging a Book

Sean Bechhofer — Mon, 25 Jan 2010 10:49:10 +0000

We’ve just had an interesting couple of days at the Ontogenesis Blogging a Book Meeting. I found myself adopting the position of naysayer on a few occasions (that’s usually Phil’s job, but he was running the meeting), raising questions about the process and technologies being applied. This post is an attempt to reflect on the meeting and try and identify why I was uncomfortable with the exercise, and what one might do to address the situation.

I should first make it clear that I agree with the overall aims of the exercise (see below). In addition, although this commentary is perhaps negative in a number of ways, it is intended to be constructive criticism. This was a very interesting and thought-provoking couple of days. Many thanks to Robert Stevens, George Moulton and Phil Lord for organising the meeting and providing the initial stimulus. Now, out with the knives!

Note that the opinions expressed here are mine and may not represent the views of others involved in either the ontogenesis network or this particular meeting. This may also be a slightly half-baked rendering of my thoughts and may be subject to review!

Are we nearly there yet?

What were we trying to achieve? I think the meeting had two purposes.

A. Writing a number of short “encyclopedia style” articles relating to ontologies (and their use in bioinformatics).
B. Investigating new models for the publication process, in particular the use of a blog in order to manage the review process.

The former was to be realised through the latter. The rationale for A is clear, for B, the intention is to try and reduce some of the overhead and time delay that can be present when using traditional publishing routes. However, in my final analysis I think the difference between the kinds of short article for an encyclopedia and longer scientific papers means that the process hampered us somewhat in the production of our initial articles (“why didn’t we just use a wiki”). This is not to say that the blogging approach is not appropriate as a mechanism to support the publication process (in fact I think it might work fine if tweaked), but the jury’s still out.

Process

The process for the meeting was roughly as follows. A number of topics for entries were identified. People selected topics that they wanted to write an entry for (in some cases this may involve multiple authors). Each entry was then written as a blog post. Once the author considered the entry ready for review, it was tagged appropriately. Reviews were also written as blog posts. Through the use of WordPress’s trackback (or was it pingback?) mechanism, by including a link to the original post in the review, the review appears as a comment in the original post.

Categories were used to indicate the status of articles (under review, reviewed, peer review), while tagging indicated content.
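The flow described above can be sketched as a toy model. Everything in this Python is an assumption made for illustration: the `Post` class, the `publish` function and the example URLs and names are hypothetical, and the link-back step only mimics the effect of a trackback rather than using WordPress itself. Reading “peer review” as the category for review posts is likewise an interpretation, not something the text states.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Post:
    """Hypothetical blog post: the category carries status, tags carry content."""
    url: str
    author: str
    category: Optional[str] = None                      # e.g. "under review"
    tags: List[str] = field(default_factory=list)       # content keywords
    links: List[str] = field(default_factory=list)      # URLs linked in the body
    comments: List[str] = field(default_factory=list)   # includes link-backs


def publish(blog: Dict[str, Post], post: Post) -> None:
    """Add a post; each post it links to gains a comment pointing back."""
    blog[post.url] = post
    for target in post.links:
        if target in blog:
            blog[target].comments.append(f"Review by {post.author}: {post.url}")


blog: Dict[str, Post] = {}

# An author writes an entry and marks it as ready for review.
article = Post(url="/article-1/", author="Author A",
               category="under review", tags=["ontology"])
publish(blog, article)

# A reviewer writes the review as a post that links to the article.
review = Post(url="/review-1/", author="Reviewer B",
              category="peer review", links=["/article-1/"])
publish(blog, review)

# Once the comments are addressed, the article's status category changes.
article.category = "reviewed"

print(article.comments)
```

The review shows up in the article's comment trail without anyone editing the article, which is how the trail of attribution is preserved.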

The idea is that through the use of the commenting mechanism, we can preserve a trail of comments and reviews, which not only provide an insight into the evolution of the article, but also provide some attribution and credit for the work of the reviewers. Note that this assumes that the process is open, with reviewers’ comments and identity visible. This is a great idea, but the meeting showed that there are some issues with the actual delivery of this using the technology.

Short or Long?

The meeting was focused on writing short, collaborative articles with a quick turnaround (we hoped to produce a number of initial articles by the end of the meeting). However, such writing is very different from the lengthy scholarly articles that one might expect to find in a journal. Encyclopedia articles tend to be (or at least should be, imo) objective and factual rather than opinion or subjective interpretation. The kinds of review required for the activities are different. For a short article with rapid turnaround, I would like to get quick feedback about whether there are significant pieces missing, or whether content is on- or off-topic. The fact that we were all co-located also meant that I wanted/expected quick feedback: “shouts across the room”. A review of a full paper would be lengthier and in-depth, and could also require more detailed referencing of specific sections of the original article. I’d also be happy to wait longer.

Wiki or Blog?

A number of times during the meeting, the question “why aren’t we doing this through a wiki?” was asked. Valid question, and one which I’m not sure there really was a good answer for. The reality is that what we were trying to do falls somewhere between what’s offered by a blog and a wiki. A wiki may well have been a better environment for supporting the collaborative writing and commenting process for the encyclopedia.

Communication

Authors and reviewers needed to communicate. For example, the process as defined required authors to request reviews for articles. This was done “out of band” (i.e. without using the mechanisms of the blog). In practice, this was done by email, or simply by direct communication (as all the participants were actually co-located for the two days).

There were occasionally suggestions that reviewers/authors could communicate directly, e.g. in order to exchange information on minor typographical errors. In this case, distinguishing what kinds of communication should occur in band and out of band is important, particularly if records of the communication are intended to be part of the process — at what point do grammatical changes become substantial changes to the intention of an article?

Technologies

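As an aside, I found the WordPress UI a nightmare to work with. I don’t really like browser based editors, so prefer to author text using a text editor, and then cut’n’paste into the web tool. Once I’d pasted text into the text box, however, I found it messed with my underlying HTML markup (adding lots of <br> elements and stripping all my <p>’s out). Tables seemed problematic too, although that may be something to do with the underlying style.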

Conclusions

In order to prevent this from just being a pointless rant, I would like to conclude with some suggestions/observations. This was a useful activity, but I would probably approach it in a different way if I was to repeat it. Below are a number of questions/points that I think need further investigation. Some of these are technology related (e.g. WordPress doesn’t do what’s needed), some are more about identifying the process and requirements.

1. A clearer identification of the process. What are the different steps/phases that an article will go through? This is particularly important, as I think the processes involved in writing the encyclopedia are different to those that would be in place for “journal style” reviewing.
2. Versioning. What is the versioning strategy that is used? Are articles edited “in place”, or should edits result in a new article? In which case, should all edits result in new articles, or can we fix typos? Who then decides what edits are “acceptable”?
3. What’s the role of the editor (if any)? Do we need a central controller?
4. Communication between authors/reviewers/editors. Mechanisms are needed that allow communication between the various actors. What is in-band and out-of-band? How much did/does the physical co-location of the participants impact on the process?
5. The ability to deal with different kinds of information in a review. There can be comments about the presentational aspects of the work (e.g. typos, grammatical errors and so on), as well as more substantive comments relating to the content of the work.
6. A clearer identification of the activities/tasks that actually required interaction and communication, and how those are managed. A number of things (soliciting reviews for example) were done “in person”. Clearly this would not be possible if participants hadn’t been co-located (although email would also work).

Some of the points above (e.g. 1 and 2) need to be considered independently of the technology being used to deliver them (although it is important to bear in mind what is possible/feasible). There was an occasional tendency to bend what we were trying to do to fit with the WordPress functionality (e.g. single categories).

First Post

Phillip Lord — Sat, 23 Jan 2010 14:13:47 +0000

The Ontogenesis knowledgeblog has achieved its first post to go from authoring through peer review. Congratulations to Allyson Lister for being the first.