- How much work is it going to be and what timeframe is realistic?
- How broadly and how deeply should you cover the domain? Where should you stop?
- Work alone or seek collaborators?
- Should you start by setting up a mailing list, or by producing a first draft?
- How much documentation do you need to produce?
- Whose feature requests and modeling ideas should you heed and whose ignore?
- How to keep pushing towards the uncertain goal of “adoption” in the face of limited time?
A few days ago, the VoID vocabulary became a W3C SWIG Note. VoID started in 2008 as a loose collaboration between Jun Zhao, Keith Alexander, Michael Hausenblas and me. We published a first non-W3C version in 2009. The W3C publication is a nice milestone for us, and I thought this a good opportunity to share some of the lessons I have learned along the way.
I will focus on process and collaboration in this post, and say little about modeling practices or publishing tools or RDFS/OWL geekery.
Lesson #1: Work in a team. Three or four people, each with their own use cases or data, might be ideal. A team ensures that a variety of use cases are covered; it keeps fluctuations in any one person’s available time from stalling the project; it mellows any strong personal style in the modeling and design; and it widens the network available for reaching out to potential users. Having a team of a few motivated people is perhaps the most important factor for success.
Lesson #2: Take your time. For all of us, VoID was a low-priority “background task”. We all get paid for doing other things. Inevitably, progress was often slow, with months where literally nothing happened. I probably averaged less than an hour of VoID work per week (with occasional major bursts of activity).
And that might be the best way. Progress in vocabulary design is not measured by how quickly one produces a polished spec. Progress is learning about the needs of the potential user community. Going slowly means more opportunity for feedback at every stage, and reduces the risk of creating something that nobody needs.
We also moved the vocabulary to a different host twice in the process. This worked out OK because we could retain the original namespace URI throughout the moves, but it definitely shows the advantage of going with something like purl.org from the start.
Lesson #3: Use a public issue tracker. This is crucial, even if you work alone. It adds structure to the work process and helps to ensure that no balls get dropped. Some issues will remain unresolved for long periods of time, and you need a place for collecting the random comments, discussions, related links, proposed text for changes and so on.
I think it’s important to use a tracker that is easy to work with, ideally one that the contributors are already familiar with. We used the one from Google Code. It’s simple and just works.
Setting up a Google Code project for developing the vocabulary worked very well for us. Besides the tracker, we also used the SVN repository for the spec, and the simple wiki for random bits of information, like lists of deployments, and examples that didn’t fit into the spec.
Don’t try to use a wiki or Google Doc or other funky collaboration device in place of an issue tracker. I’ve seen that done elsewhere and it doesn’t work.
Lesson #4: Perfection can wait till the next version. This sounds banal, but it is so important. At some point quite a while ago, we were all quite fed up and just wanted to get something out the door. So we decided not to tackle a lot of difficult open issues. We told ourselves that we would just do them in a second version. This turned out to be immensely liberating.
After version 1, we took a long break, and then started to work on version 2. Now we knew that deferring to the next version is always an option (which we used liberally). Not really clear if that use case is worth the effort? Defer. Not enough evidence or experience to inform the design? Defer. Two pig-headed contributors (that is, Keith and me) can’t agree on a design? Defer.
Lesson #5: Regular Skype calls. This one might be controversial, because no one likes wasting time in weekly conference calls. But I think it worked well for us. We didn’t quite do weekly calls, but scheduled them ad hoc, averaging perhaps one every two weeks. Often, the only progress between calls was that one of us felt a bit of shame and quickly did one or two of their actions in the thirty minutes before the call. This adds up over the months and makes sure that there is slow but steady progress.
We took turns chairing and scribing. The chair would take us through the agenda (typically “review open actions; review issues list; discuss particularly thorny issue XYZ; AOB; schedule next call”) and interrupt any discussion that started to go in circles. The scribe would note whenever someone took an action to do something, and afterwards email a list of those actions and the date for the next call. A good call duration is somewhere between 60 and 90 minutes.
Lesson #6: Have a working draft of the spec from day one. Even if it’s just a few scribbles. Call them your working draft and take it from there. Then get into the habit of focussing any discussion on the question: What change should be made to the text? Arguing about words that should go into the text is much more productive than the alternative, which is arguing about who is right or wrong. Ideally, whenever people start to disagree, they should draft up competing change proposals to be discussed in the next call.
Besides the spec text in SVN, we used Neologism to create and publish the actual RDFS vocabulary specification.
Lesson #7: Public mailing list is optional. Don’t you hate signing up to yet another mailing list? Me too. We started with a private mailing list, and found that its only real use was for notifications from the issue tracker. Discussion happened on Skype or in the tracker. We put external comments into the tracker too and discussed them there. This worked well.
This is about the creation phase of the vocabulary. It might be a different story once you get a bit of a user community going. We now have a public discussion list.
Lesson #8: Start over a beer and a large piece of paper. If you can. With everyone physically in the same room. That’s how we did it anyway, at a conference, and it was quite helpful for figuring out a core part of the vocabulary that seemed uncontroversial. Most of that time was spent arguing about (I’m sure this will come as no surprise to you) a name for the project.