You don’t need URI prefixes in RDF queries

Update: Everyone hates the idea; some for good reasons.

Properties and classes in RDF are identified by URIs. This is important because we want to be able to say additional things about them. But it has a cost. It makes RDF harder and uglier. Just think about the time you’ve spent copying and pasting prefix declarations and hunting for the right namespace URI for some vocabulary. Still, that’s a cost we have to pay in a system without a centralized schema.

Not so with queries. Check out this one:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX doap: <http://usefulinc.com/ns/doap#>
SELECT DISTINCT ?projectName ?personName
WHERE { 
  ?person foaf:name ?personName .
  ?person doap:project ?project .
  ?project doap:name ?projectName .
}

The prefixes in this query are utterly superfluous. They are noise. They are ugly. They are a pain. They cause errors. They kill serendipity. All they provide, if anything, is a false sense of security.

Make them optional! Here’s a better version:

SELECT DISTINCT ?projectName ?personName
WHERE {
  ?person :name ?personName .
  ?person :project ?project .
  ?project :name ?projectName .
}

The query processor should match the QNames regardless of namespace. Thus, :name would match both foaf:name and doap:name. Writing SPARQL queries could actually be fun.

So I think that, if a query doesn’t declare the default namespace, then the default namespace should be understood to match any namespace.
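
Until a query processor supports this, something similar can probably be approximated in standard SPARQL by turning each property into a variable and filtering on the end of its URI. This is only a sketch: the variable names ?nameProp, ?projectProp and ?projectNameProp are made up for illustration, and the patterns assume that every namespace URI ends in "/" or "#".

```sparql
SELECT DISTINCT ?projectName ?personName
WHERE {
  ?person ?nameProp ?personName .
  ?person ?projectProp ?project .
  ?project ?projectNameProp ?projectName .
  # Keep only properties whose URI ends in the wanted local name.
  # Assumption: the namespace part of each URI ends in "/" or "#".
  FILTER regex(str(?nameProp), "[/#]name$")
  FILTER regex(str(?projectProp), "[/#]project$")
  FILTER regex(str(?projectNameProp), "[/#]name$")
}
```

A query like this will match a name or project property from any vocabulary in the data, not only FOAF and DOAP, and it is likely to be slower than the prefixed version because the processor has to test every property URI against the pattern.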


7 Responses to You don’t need URI prefixes in RDF queries

  1. Evan says:

    -1, for a few reasons. First, foaf:name and doap:name (to use your example) are very different, and lumping them together should only be done if the user explicitly specifies that it should be done, and it is likely there would be far more accidental collisions than purposeful collisions.

    Because we’re operating under an open-world assumption, we can’t know what to expect from our data. What happens if the wine example (from the OWL spec) creeps into our dataset? I’m not sure, but there is probably a wine:name in there. How would you like to be scrolling through a list of people and all of a sudden you come across “Cabernet Sauvignon”?

    There is also the fact that, IMHO, we should try to be similar to SQL where possible. Your idea would be quite counter-intuitive for a lot of people coming from SQL backgrounds, because in SQL if more than one table has a column you’re trying to select, the query will fail unless you explicitly specify which table you want the data from.

    Finally, from an implementation perspective this would be extremely annoying, since URIs would have to be stored as two distinct objects. This would make a lot of implementations a lot slower.

  2. Ora Lassila says:

    I meant I disagree with the original post, not with Evan’s comment (with which I agree).

  3. drewp says:

    I also think the whole plan is poor, and I wonder why you’d even keep the colons if you were doing random-suffix-matching like that.

    But, it seems like there might be a useful trick buried in here if foaf:name and doap:name both had rdfs:label of “name”. Your “unresolved” predicates could each be replaced with a variable that matches a predicate with rdfs:label “name” that’s in one of the namespaces that you know about (here, it would be foaf and doap).

    SELECT DISTINCT ?projectName ?personName
    WHERE {
    ?person ?name0 ?personName . # "name" gets translated to "?name0"
    ?person ?project0 ?project .
    ?project ?name1 ?projectName .

    # below here can be generated automatically
    ?name0 rdfs:label "name"; :partOf foaf: .
    ?name1 rdfs:label "name"; :partOf foaf: .
    ?project0 rdfs:label "project"; :partOf foaf: .
    }

    My impromptu sparql is not good enough to specify “foaf *or* doap”, and I don’t know how :partOf is really written. But otherwise, it’s valid, well-constrained sparql that doesn’t need a special processor (after the addition of my automatic stuff). It won’t mysteriously change behavior if I merge in some data that uses the predicate drewp:name. The query results could tell you where name0 and name1 got resolved. I think it’s still a silly plan, but it might possibly have a use if the user is insisting on omitting prefixes.

  4. drewp says:

    that huge gap above is actually a comment that says “below here can be generated automatically”

  5. Danny says:

    Madness! Could be fun to try though – regexs are possible on URIs I think..? Make like there’s consensus on naming (microformats anybody?).

  6. Henry Story says:

    I think you just posted this to get attention ;-)

    I like the way you do it with snorql in D2RQ server. On the web interface you write out the prefixes the database knows about, which saves one having to look up those prefixes and type them out, as in the roller sparql interface here:
    http://roller.blogdns.net:2020/snorql/