Back from ESWC 2008

The European Semantic Web Conference is over and I’m back in Galway. It was a lot of fun: two hundred smart and interesting people in a seaside tourist resort, and the thought “I can’t believe I’m getting paid for this” crossed my mind a few times.

It was good to see that some of the topics close to my interests (SPARQL, Linked Data, the LOD project, Semantic Search) were well represented in the conference program and in the attention of the crowd.

I gave four talks:

1. A five-minute lightning talk on Sindice.com in Sunday’s spontaneous and surprisingly well-attended LOD gathering. Slides (PDF)

2. Neologism: Easy Vocabulary Publishing, together with Sergio Fernández, in the SFSW workshop, about our soon-to-be-released RDFS vocabulary editor and publishing system. Slides (PDF)

3. Semantic Sitemaps, about DERI’s proposed Semantic Sitemaps standard for improved discovery of RDF data published on the Web. Slides (PDF)

4. There’s more than US-ASCII, a two-minute lightning talk in which I complain bitterly about the inability of Semantic Web developers to properly handle characters outside of the usual US-ASCII charset in their apps. The single slide is reproduced below.

There's more than US-ASCII (presentation slide)

The best part was meeting in person for the first time some of the people I’m working with on a regular basis, including Michael Hausenblas, Keith Alexander, Hugh Glaser, Andreas Langegger, and Olaf Hartig, and making some exciting new connections. Email is great, but sitting down face to face still beats it by an order of magnitude in effectiveness, and also in fun if drinks are involved.

Posted in General, Semantic Web | 2 Comments

What is your RDF browser’s Accept header?

I was debugging a content negotiation issue the other day and made a little tool that shows which Accept header different RDF-aware HTTP clients send. If you ever need to know the Accept header of a particular RDF-aware HTTP client, just make it show the RDF loaded from this URI:

http://richard.cyganiak.de/2008/03/rdfbugs/accept.php

The RDF contains a dc:description with the browser’s Accept header. If you have the Tabulator Firefox extension installed, you can simply click the link and see the output.

I tried this with a couple of tools and here are the results:

Tabulator Firefox Extension 0.8.2:

application/rdf+xml, application/xhtml+xml;q=0.3, text/xml;q=0.2,
application/xml;q=0.2, text/html;q=0.3, text/plain;q=0.1, text/n3,
text/rdf+n3;q=0.5, application/x-turtle;q=0.2, text/turtle;q=1

Jena’s Model.read(…) method:

application/rdf+xml, application/xml; q=0.8, text/xml; q=0.7,
application/rss+xml; q=0.3, */*; q=0.2

Disco Hyperdata Browser:

application/rdf+xml;q=1,text/xml;q=0.6,text/rdf+n3;q=0.9,
application/octet-stream;q=0.5,application/xml q=0.5,
application/rss+xml;q=0.5,text/plain; q=0.5,application/x-turtle;q=0.5,
application/x-trig;q=0.5,text/html;q=0.5

OpenLink RDF Browser:

application/rdf+xml, text/rdf+n3, application/rdf+turtle,
application/x-turtle, application/turtle, application/xml, */*

SindiceBot:

application/rdf+xml, application/xml;q=0.6, text/xml;q=0.6
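To compare headers like the ones above, it helps to split them into media types and q values. Here is a minimal sketch in Python; the function name `parse_accept` is my own invention, and the parsing is deliberately naive (no handling of quoted parameters), so treat it as an illustration and not as a complete Accept header parser.

```python
def parse_accept(header):
    """Return (media_type, q) pairs, highest preference first."""
    entries = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue  # skip empty items, e.g. from a trailing comma
        q = 1.0  # HTTP default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    pass  # keep the default if the value is not a number
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their original order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


jena = ("application/rdf+xml, application/xml; q=0.8, text/xml; q=0.7, "
        "application/rss+xml; q=0.3, */*; q=0.2")
for media_type, q in parse_accept(jena):
    print(media_type, q)
```

Note that the sketch does not try to repair entries: one that lacks the semicolon before its q parameter, like `application/xml q=0.5` in the Disco header, comes out verbatim as a media type with the default q of 1.0.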

Some of these are pretty funny actually, but that’s a post for another day.

Posted in General, Semantic Web | 7 Comments

Tabulator does N3

In my podcasted chat with Danny Ayers the other day I said that Tabulator doesn’t support N3, the highly readable RDF serialization syntax developed by Tim Berners-Lee.

It turns out I was wrong. I thought Tabulator supported RDF/XML only. But it turns out Tabulator has excellent support for N3 as well. I’m not sure how I managed to miss this. Seems like the Tab’r team sneaked that feature in while no one (or at least not I) was looking!

To make Tabulator eat your N3, you just have to make sure it’s served with the right content type: text/rdf+n3. If you publish N3, you can test this by installing cURL (see my earlier cURL for SemWebbers tutorial) and running:

curl -I "http://example.com/myfile.n3"

If your server is set up correctly, then the output contains a line like this:

Content-Type: text/rdf+n3; charset=utf-8

If you see e.g. text/plain instead, then your server is misconfigured. If you serve static files with Apache, you can fix this by adding this line to your Apache’s httpd.conf file or to a file called .htaccess in your DocumentRoot:

AddType text/rdf+n3;charset=utf-8 .n3

And then make sure that your filenames end in .n3.
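If you would rather script the check than eyeball the cURL output, something like the following Python sketch should do. The function names are made up for this example, and the URL in the usage comment is the same placeholder as in the cURL command above.

```python
import urllib.request

EXPECTED_N3_TYPE = "text/rdf+n3"


def media_type(content_type_header):
    """Strip parameters such as charset and normalise case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def served_as_n3(content_type_header):
    """True if the header value names the N3 media type."""
    return media_type(content_type_header) == EXPECTED_N3_TYPE


def fetch_content_type(url):
    """Send a HEAD request (like curl -I) and return the Content-Type header."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Type", "")


# Usage (needs network access and a real URL):
# print(served_as_n3(fetch_content_type("http://example.com/myfile.n3")))
```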

Actually, this is really good news. As far as I’m concerned, the Tabulator Firefox extension, being the most readily accessible Semantic Web client currently out there, defines what is on the Semantic Web and what isn’t. If you can’t browse it in Tabulator, then it isn’t. (Insert caveat about RDFa and SPARQL here, which Tabulator probably will support in the future.)

I really like N3, because it is a much friendlier format than the alternative RDF/XML. It is very easy to read, generate, and even hand-write. More so than, say, JSON. A Semantic Web built on N3 would be a much nicer place than a Semantic Web built on RDF/XML. Tool support for N3 is still not quite as good as for RDF/XML, but we are getting there.

So, what does this mean in practice?

  1. Do you develop or maintain an application that consumes RDF from the Web? If you do, then you should make sure that it understands both RDF/XML and N3.
  2. Do you develop or maintain a library, framework or API that can load and parse RDF from the Web, from a URI? If you do, then you should make sure that it invokes the right parser, depending on the Content-Type header of the response. This should happen completely transparently from your users’ point of view.
  3. Do you author educational material, tutorials or slides that use RDF in examples? Do your audience a favor and do them in N3, not RDF/XML!
  4. If you already produce RDF/XML, as an information publisher, you shouldn’t worry. RDF-aware clients won’t stop supporting RDF/XML.
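For point 2, the dispatch itself can be tiny. The sketch below assumes a hypothetical library with two parsers that I simply call "rdfxml" and "n3"; the names and the lookup table are placeholders, not the API of any real toolkit.

```python
# Hypothetical parser names keyed by media type; extend as needed.
PARSERS_BY_MEDIA_TYPE = {
    "application/rdf+xml": "rdfxml",
    "text/rdf+n3": "n3",
}


def choose_parser(content_type_header):
    """Pick a parser name from the Content-Type header of a response."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    try:
        return PARSERS_BY_MEDIA_TYPE[media_type]
    except KeyError:
        raise ValueError(f"No RDF parser registered for {media_type!r}")


print(choose_parser("text/rdf+n3; charset=utf-8"))
print(choose_parser("application/rdf+xml"))
```

Raising an error for unknown types is one possible design choice; a library might instead fall back to a default parser or try to sniff the content.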

Making these things happen in the tools and documents that I myself maintain will be quite a bit of work. But I think it’s worth it. N3 is a bit like the old days of HTML: you can actually view source and understand what’s going on. N3 is the human-friendly way of writing RDF.

Posted in General, Semantic Web | 6 Comments

QOTD: timbl is a document

Simon Spero on the SKOS list:

The meaning of “documet” in this context is extremely broad; if we follow Otlet’s definition of a document as anything which can convey information to an observer, the term would seem to cover anything which can have a subject.

By this standard, timbl is a document, but only when someone’s looking.

Ah, the Semantic Web community! Please leave your common sense at the door …

Posted in General | 3 Comments

I named 61 HTML elements in 5 minutes.

61

(via Dominik Wagner)

Posted in General | 1 Comment

A challenge for semantic query wizards

Bernard Vatant:

A challenge for semantic query wizzards: Find the smallest connected graph having at least a node in each LOD data set

Interesting. The hardest part might be finding paths through the FOAF bubble.

Any takers?

Posted in General, Semantic Web | Comments Off

Finding Rahoon

Rahoon is a nondescript area of Galway, the third-largest city in the Republic of Ireland. Unsurprisingly, I had never heard about Rahoon before last week, when I moved there, to join Giovanni Tummarello’s team at DERI Galway.

Moving is stressful. But being an RDF geek, I not only have to drag physical stuff around, but I also have to update a few triples. For example, by joining DERI I got yet another URI:

http://www.deri.ie/fileadmin/scripts/foaf.php?id=313#me

That’s a new owl:sameAs for my FOAF file. I also have to update my foaf:based_near triple. Until now, my FOAF file stated:

<#cygri> foaf:based_near <http://dbpedia.org/resource/Berlin> .

The obvious new value for this property would be http://dbpedia.org/resource/Galway. But I want to be a bit more specific. That’s where this post’s title comes in. *What is the URI of Rahoon?*

The big datasets: Unfortunately, Wikipedia doesn’t have an article for Rahoon, hence it isn’t in DBpedia.

Then there’s Geonames, the most comprehensive source of geographical information on the Semantic Web. It has an entry for a nearby structure called Rahoon House, but not for the area itself.

The search engines: Next I tried all the Semantic Web search engines from the Linking Open Data project’s list.

Of the seven available services, only three produced any results. I was quite disappointed that the venerable Swoogle, one of the first large-scale Semantic Web indices, didn’t return any results at all.

SWSE (a DERI project) returned two hits, but neither of them was semantic (a web page mentioning Rahoon and a Bollywood-related RSS item).

Sindice (disclosure: it’s another DERI project, and I am now a team member) did much better. It found Geonames’ Rahoon House and a bunch of loosely Rahoon-related things from DBpedia, but it didn’t turn up any URI for the Rahoon area itself.

Finally there is Falcons, a recently announced project developed at Nanjing’s Southeast University. Its results are similar to Sindice’s, except that it missed the Geonames entry.

In summary, only Falcons and Sindice found anything of interest, but neither struck gold.

Mint your own? At this point, I perhaps have to accept that the only existing relationship between RDF and Rahoon is the fact that Giovanni has been living here for a while, and that I’m not going to find a URI for the place.

The usual advice at this point is to mint a new URI for the thing in question. But I don’t want to go down this road, because I simply do not feel sufficiently competent in matters of Irish geography. What “is” Rahoon, really? An administrative area? A geographical region? A postal code? A bus stop? I don’t know.

Based near a blank node: My solution is to ignore the widely accepted wisdom that RDF blank nodes are considered harmful. I will state that I’m living near something, and describe that something as well as I can:

foaf:based_near [
    a pos:Point;
    pos:lat "53.27702";
    pos:long "-9.09019";
    rdfs:label "125 Rosan Glas, Rahoon, Galway, Ireland";
    geo:parentFeature <http://sws.geonames.org/2964180/>
];

The skeleton of this N3 fragment was easily created using my FOAF geolocator (which I recently extended with address search and N3 output — have a look at it if your FOAF file still lacks geolocation!). I added a label and a link to the next-largest Geonames feature (Galway), which I easily found with Sindice.
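For illustration, here is how a fragment of this shape could be generated from coordinates, a label and a parent feature. This is a hypothetical Python sketch, not the geolocator’s actual code; the function name is invented, and it assumes the pos:, rdfs:, geo: and foaf: prefixes are declared elsewhere in the file.

```python
def based_near_fragment(lat, long, label, parent_feature):
    """Build a foaf:based_near blank node description as an N3 fragment."""
    # Escape backslashes and double quotes so the label stays a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "foaf:based_near [",
        "    a pos:Point;",
        f'    pos:lat "{lat}";',
        f'    pos:long "{long}";',
        f'    rdfs:label "{escaped}";',
        f"    geo:parentFeature <{parent_feature}>",
        "];",
    ]
    return "\n".join(lines)


print(based_near_fragment(
    "53.27702",
    "-9.09019",
    "125 Rosan Glas, Rahoon, Galway, Ireland",
    "http://sws.geonames.org/2964180/",
))
```

Passing the coordinates as strings keeps them exactly as typed, which avoids surprises from floating-point formatting.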

Rahoon is still without a URI, but I guess I should let it be for now and rather worry about applying for a social security number, registering for taxation, and so forth.

DERI! At any rate, I’m happy to be here and look forward to working with a great team on some very exciting projects.

Posted in General, Semantic Web | 12 Comments

LazyWeb request

a) A Sudoku solving web service.
b) A Sudoku generating web service.
c) Hook them up to each other.

Posted in General, Semantic Web | 4 Comments

ObjectViewer: Yet another linked data browser

Via Troy Self’s introduction to the Linking Open Data list I came across the ObjectViewer, yet another Semantic Web browser based on the linked data principles. This increases the number of available Semantic Web browser prototypes to four: Tabulator, Disco, the OpenLink Ajax Toolkit browser, and now ObjectViewer.

ObjectViewer has quite a nice visualization of the browsed resource as a simple graph, which isn’t really all that useful in practice, but always makes for stunning demos. A live example on some DBpedia data is here (the browser chrome is missing; I couldn’t figure out how to make a proper direct link), and a screenshot is below.

Objectviewer screenshot

Webby data everywhere!

Posted in General, Semantic Web | 2 Comments

Neil Bartlett: “StatSVN helps startups get funded”

Neil Bartlett has an interesting take on StatSVN and StatCVS:

One problem that startup companies often have is demonstrating to investors that they’re actually doing something productive rather than just pouring away money on office plants, Herman Miller chairs, and playing foosball all day. … One thing you can do is show the evolution of your code over a period of time using a tool like StatSVN.

Lines of code are certainly not the most meaningful numbers, but they are a nice and simple way of demonstrating activity. Sometimes that’s all you need.

Posted in General | 6 Comments