URIs have a namespace part and a local part, right?

This is a technical post on the way URIs break down into “namespace parts” and “local parts” in RDF. It was prompted by this comment in a recent discussion:

In a URI, the namespace part ends with the last slash or hash, right?

So, the namespace of <http://example.com/foo/bar-123> ends after /foo/, and bar-123 is the local part, right?

The answer is neither yes nor no. The question is based on a wrong assumption.

Let me explain.

Namespaces in RDF are a strange beast

On the one hand, one could say that they are purely a syntactic convenience for shortening repetitive URIs, and carry no semantic meaning.

On the other hand, one could say that they are an integral part of the “user interface” to RDF data. Users, for example those writing SPARQL queries, much prefer to interact with the data in its prefix-abbreviated form and not through the full URIs.

Different parts of the RDF stack embody these different views. For example, the N-Triples syntax doesn’t support namespace abbreviation at all. But the RDF/XML syntax doesn’t work without namespace abbreviation (and furthermore makes it impossible to abbreviate certain valid URIs).

So where does the namespace part end?

It’s sloppy to say that URIs have “namespace parts” and “local parts”. Rather, it would be more accurate to say:

Given a certain prefix mapping, a URI can be broken up into a namespace part and a local part, possibly in different ways.

Consider this prefix mapping, written in Turtle:

@prefix a: <http://example.com/>.
@prefix b: <http://example.com/foo/>.
@prefix c: <http://example.com/foo/bar->.

Now, given this prefix mapping, the URI <http://example.com/foo/bar-123> can be broken up into namespace and local parts in three different ways, yielding three different local names:

a:foo/bar-123
b:bar-123
c:123
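To make the idea concrete, here is a minimal Python sketch of the same splitting. The helper name split_candidates and the PREFIXES dictionary are made up for illustration; they simply mirror the Turtle prefix mapping above. The function does plain string matching and returns the raw local parts, without applying the escaping rules of any particular syntax.

```python
# Hypothetical prefix mapping, mirroring the Turtle declarations above.
PREFIXES = {
    "a": "http://example.com/",
    "b": "http://example.com/foo/",
    "c": "http://example.com/foo/bar-",
}


def split_candidates(uri, prefixes):
    """Return every (prefix, local part) pair whose namespace starts the URI."""
    return [
        (prefix, uri[len(namespace):])
        for prefix, namespace in prefixes.items()
        if uri.startswith(namespace)
    ]


# Print each raw abbreviation of the example URI under this mapping.
for prefix, local in split_candidates("http://example.com/foo/bar-123", PREFIXES):
    print(f"{prefix}:{local}")
```

A URI that matches none of the namespaces yields an empty list, which is the "zero ways" case.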

Now a couple of observations:

  1. There’s nothing inherently special about hashes or slashes in URI patterns, as shown in the third prefix.
    It’s a common convention to define prefix mappings that go up to the last hash or slash (again due to the influence of old RDF/XML where you couldn’t have hashes or slashes in local names), and many tools that automatically create prefix mappings will do this, so you will see b:bar-123 more often than the other forms. But that’s merely a convention, and the other forms may well be more convenient for users sometimes.
  2. The form a:foo/bar-123 actually needs to be escaped as a:foo\/bar-123 if written in Turtle or SPARQL, because unescaped slashes are not allowed in the local part of a prefixed name.
  3. This escape mechanism was only introduced in the W3C Recommendation version of Turtle and in SPARQL 1.1, so it may not work in older parsers, and community awareness of it is regrettably low.
  4. The form c:123 will work in Turtle and SPARQL but not in RDF/XML, because XML requires that element names start with a letter or underscore. XML names cannot contain slashes either, which rules out a:foo/bar-123. So, in RDF/XML, only b:bar-123 works.
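The observations about escaping and RDF/XML can be sketched in Python as well. This is a rough approximation, not a full implementation of either grammar: turtle_local only escapes slashes (Turtle and SPARQL 1.1 define escapes for other characters too), and the XML_NAME pattern is an ASCII-only simplification of the real XML name rules. Both names are hypothetical.

```python
import re

# Simplified, ASCII-only approximation of an XML name without a colon.
# The real XML rules also admit many non-ASCII characters.
XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def turtle_local(local):
    """Backslash-escape slashes in a local part (other escapes are ignored here)."""
    return local.replace("/", "\\/")


def usable_in_rdfxml(local):
    """True if the local part fits the simplified XML name pattern above."""
    return XML_NAME.fullmatch(local) is not None


# For each form, show the Turtle spelling and whether RDF/XML could use it.
for prefix, local in [("a", "foo/bar-123"), ("b", "bar-123"), ("c", "123")]:
    print(f"{prefix}:{turtle_local(local)}", usable_in_rdfxml(local))
```

Under these simplified rules, only the local part bar-123 passes the RDF/XML check, matching the last observation.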

To summarise, URIs don’t simply have a namespace part and local part. Rather, someone defines a prefix mapping, and under that prefix mapping, there may be zero, one or more ways of abbreviating any given URI.
