on March 1, 2010 by Phillip Lord in Meta, Comments (1)

Considering the Process



Introduction

In this post, I want to address some of the issues raised by Sean Bechhofer in his thoughtful reflection. One of my aims with thinking about the knowledgeblog process was to come up with a form of publishing which is focused on the reader, reviewer and author, rather than the publisher. Clearly, Sean found it lacking in this respect and the process needs improving as a result.


Technology

I wanted to address what I think is the most answerable comment first.

As an aside, I found the WordPress UI a nightmare to work with. I don’t really
like browser based editors, so prefer to author text using a text editor, and
then cut’n’paste into the web tool. Once I’d pasted text into the text box,
however, I found it messed with my underlying HTML markup (adding lots of <br/>
elements and stripping all my <p/>’s out). Tables seemed problematic too,
although that may be something to do with the underlying style.

The short answer here is, yes, I agree. The management side of the WordPress UI is, I think, generally okay and reasonably easy to use. But the post editor is designed for people who want to write short, unstructured notes. I’ll talk more about wikis later on, but I think that they have a similar problem.

I was quite keen with knowledgeblogging to separate out the process of editing from the process of publishing. Science is, I think, just too complex for a one-size-fits-all approach. People want to achieve different things and no tool stands out as being ideal. For collaborative editing, for example, Google Docs works really well. For structured authoring, LaTeX is excellent; with added maths, it is pretty much essential. Or consider my own technique, which uses asciidoc: some of my posts involve incorporating LaTeX, Manchester syntax OWL and Python; by using asciidoc, I can get it all syntax-highlighted, and can even incorporate code directly from source which I can run. None of these solutions is ideal for everyone; authors need to be able to pick the one that benefits them the most.

HTML is a simple enough common denominator which many tools can generate, and WordPress can (mostly) display. Of course, by allowing best-of-breed software, the knowledgeblog has problems; it’s never going to be as easy as an integrated solution. We need better documentation to enable the use of these technologies.

I think that the solution here is to have a place to lodge articles describing different mechanisms for posting on knowledgeblog.org. My solution here is to use the knowledgeblog process; I plan to start “process.knowledgeblog.org”, describing how to use various aspects of the process, including different technologies and tools for knowledgeblogging.


Style of Content

Sean asks at several points about the style of articles that we were generating and whether the process was ideally suited to it.

1. Writing a number of short “encyclopedia style” articles relating to
   ontologies (and their use in bioinformatics).

2. Investigating new models for the publication process, in particular the use
   of a blog in order to manage the review process.

The rationale for A is clear, for B, the intention is to try and reduce some
of the overhead and time delay that can be present when using traditional
publishing routes. However, in my final analysis I think the difference
between the kinds of short article for an encyclopedia and longer scientific
papers means that the process hampered us somewhat in the production of our
initial articles

And later:

The reality is that what we were trying to do falls somewhere
between what’s offered by a blog and a wiki. A wiki may well have been a
better environment for supporting the collaborative writing and commenting
process for the encyclopedia.

Again, I think I largely agree with this. The process was designed to mimic the existing process of author/review/accept with some additional value coming from the blog software; referenced articles automatically get backlinked, for example. The process normally happens asynchronously; different people do things at different times. The two key changes from the existing process are, I think, both good: first, review is public with the reviews forming part of the scientific record; second, there are no artificial deadlines coming from the publication process, so new articles can be published as they are ready.

To kick-start this, however, we needed content. Ideally, authors and reviewers would have generated content without a meeting; however, in practice, I thought this was less likely to happen; the number of articles which have been completed since the meeting seems to bear this out. The co-located meeting, however, did not ideally suit the process, as Sean suggests:

The fact that we were all co-located also meant that I wanted/expected quick
feedback: “shouts across the room”.

I do not offer any solution to square this circle; the meeting was successful in generating content, but, by its nature, the process was not entirely suited to the meeting.


Wiki or Blog

Why aren't we doing this through a wiki?

Again, a good question. In the setting of a co-located meeting, a wiki might well have been a better solution for hosting Ontogenesis. In the end, however, I don’t think that it is the right approach.

Firstly, the big issue is one of credit. It’s an issue of critical importance; scientists need credit for their work; we have to see our name attached to our words or, in time, we will be out of business. Now I may have opinions on whether this is a good thing or a bad thing, but it’s not really relevant; it is part of the way that the world is and we need the process to fit this reality, not the other way around. A blog provides this; even with a review process, it is the author’s post and they retain the credit. Or the blame, which leads to the second issue.

Wikipedia is a good example of what you can do with a wiki; like most academics (and everyone else!), I find it to be a tremendous resource and use it regularly. One thing it is fairly poor at, however, is reflecting differing opinions; as a minor example, take this post on LSIDs. This article spends more time describing what is wrong with LSIDs than describing what they are. In many ways two articles would be better, reflecting the opinions of those involved; consider the more extreme example of the global warming articles in Wikipedia and Conservapedia. Wikis are designed for multi-author, collaborative generation of content; blogs are designed for single (or a few) author content, with interactions between the different authors. Blogs seem the better fit here.

Of course, time will tell. For instance, Ontology Design Patterns appears to be implementing a peer-review and evaluation process using a wiki; they’ve used a similar approach, except that semantics in URLs, rather than in categories, differentiate between articles and reviews. But then, how much use are they making of the collaborative features of the wiki?


Versioning

The final issue to address is the thorny one of versioning, as Sean says:

It becomes a new article once the semantics are largely enough to make it new;
don't think that there is a hard or fast line here. But, ultimately, I think
we need to address this with open versioning.

and later:

Versioning. What is the versioning strategy that is used? Are articles
edited “in place”, or should edits result in a new article? In which case,
should all edits result in new articles, or can we fix typos? Who then
decides what edits are “acceptable”?

My answer here is that, once accepted, it should be possible to update an article only for technical reasons: English corrections, small errors or for updating metadata (“this article has been outdated”). Now, when does an article become a new article? In the absolute sense, I think that this is a hard question, but my answer in this more specific case is, when it has changed enough. This sort of editorial policy is one that needs to develop over time, based on specific examples. I think, however, we do need better support to ensure the integrity of the scientific record. For this purpose, I think we need to extend WordPress; the version history of each post is available in the database, and I think that it needs to be exposed for public consumption, or at least all versions since the article was made public. Of course, the current articles were not written with this in mind, but I think that future articles should be.
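
As a rough illustration of what “all versions since the article was made public” might mean in practice, here is a minimal Python sketch. It assumes that the saved versions of a post can be read out with a timestamp each; the `Revision` class, its field names, the `public_history` function and the sample dates are all hypothetical stand-ins for illustration, not part of WordPress or of any existing plugin.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    """Hypothetical record of one saved version of a post."""
    revision_id: int
    saved_at: datetime
    content: str


def public_history(revisions, published_at):
    """Return the revisions that belong to the public record, oldest first.

    That is the version which was current when the article was made public
    (the last one saved at or before `published_at`) plus every later one.
    """
    ordered = sorted(revisions, key=lambda r: r.saved_at)
    before = [r for r in ordered if r.saved_at <= published_at]
    after = [r for r in ordered if r.saved_at > published_at]
    # Drafts older than the published version stay private.
    return before[-1:] + after


# Illustrative data only; the dates and contents are made up.
revisions = [
    Revision(1, datetime(2010, 1, 20), "early draft"),
    Revision(2, datetime(2010, 1, 22), "version accepted after review"),
    Revision(3, datetime(2010, 2, 10), "typo fixed"),
]
history = public_history(revisions, published_at=datetime(2010, 1, 23))
print([r.revision_id for r in history])
```

Under these assumptions the early draft is left out, while the accepted version and the later typo fix are both shown, so a reader can see exactly what changed after publication.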


Conclusions

Some strong criticisms were raised about the process of knowledgeblogging; I think that these are mostly valid, but I think they are addressable with some changes to the process, some extensions to WordPress and, above all, better communication with the authors about what is required.

Despite the shortcomings of the process, in one short meeting, we did generate substantial content and this, in turn, has generated significant interest. In the last month (February) alone, Ontogenesis has had around 1000 page views, which is significant compared to the average scientific book. Of course, this success is mostly the success of the content and not the process; if the process is not tailored further to the needs of the authors, they will not continue to contribute.

The process has fulfilled one key need; it has provided publicity and an audience in a timely manner. For this reason, I think that our initial experiment with knowledgeblogging has been a success: limited, guarded and in need of improvement, but a success nonetheless. Hopefully, we can build on this for the future.

