Here at our group we spend a lot of time preaching the benefits of dereferenceable URIs. We often want to know if a certain URI supports all the fancy HTTP tricks that are the cornerstones of RDF publishing best practices, like 303 redirects and content negotiation.
My tool of choice for this is cURL, a command-line HTTP client that makes a useful addition to any Semantic Web developer’s toolbox. This tutorial shows how to use cURL to test Semantic Web URIs and to diagnose some common problems.
Getting cURL: Windows users can get cURL binaries from here; the first “non-SSL binary” version will work. Find curl.exe in the archive and drop it somewhere on the path, e.g. in C:\Windows. On Mac OS X and most Linux versions cURL is pre-installed.
To test cURL, open a command prompt and invoke
You should see the HTML source code of the Example Web Page.
So let’s see some of the things we can do with cURL.
Checking content types: On the Web, content types are used to distinguish between content in different formats, e.g. human-readable HTML (Content-Type: text/html) and machine-readable RDF/XML data (Content-Type: application/rdf+xml). When you request a URI, the server sends the content type and other HTTP headers along with the response. Many Semantic Web clients don’t work properly unless RDF content is served with the correct content type.
To check this with cURL, use the -I parameter. This will show the HTTP headers sent by the server.
curl -I http://sites.wiwiss.fu-berlin.de/suhl/bizer/foaf.rdf
The URL is the FOAF file of Chris Bizer. Result:
HTTP/1.1 200 OK
Content-Length: 13746
Content-Type: application/rdf+xml
Last-Modified: Thu, 18 Jan 2007 10:27:22 GMT
Accept-Ranges: bytes
ETag: "bf3d723deb3ac71:54d"
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Tue, 06 Feb 2007 10:52:51 GMT
The important line is the Content-Type header. We see that the file is served as application/rdf+xml, just as it should be. If we saw text/plain here, or if the Content-Type header were missing, then the server configuration would need fixing.
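If you have more than a handful of files to check, reading header dumps by eye gets tedious. Here is a minimal sketch of how the check above could be scripted. The function names content_type_of and is_rdf_xml are my own, not part of cURL; they only read header text from standard input, so they work on saved output as well as on a live pipe.

```shell
# Print the media type (without parameters such as charset) found in
# HTTP response headers read from stdin. Header names are matched
# case-insensitively; only the first Content-Type header is used.
content_type_of() {
  tr -d '\r' |
    awk -F': *' 'tolower($1) == "content-type" {
      sub(/;.*/, "", $2)   # drop "; charset=..." and similar parameters
      print tolower($2)
      exit
    }'
}

# Succeed only if the headers on stdin declare RDF/XML.
is_rdf_xml() {
  [ "$(content_type_of)" = "application/rdf+xml" ]
}
```

To use it, pipe the header output into the function, e.g. curl -sI http://sites.wiwiss.fu-berlin.de/suhl/bizer/foaf.rdf | content_type_of. The extra -s only silences cURL's progress meter.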
Checking for 303 redirects: RDF publishers often use 303 redirects to distinguish between URLs for Web documents and URIs for Semantic Web resources. The idea is that when I fetch the URI of a non-document thing (e.g. a person or country or OWL class), then the response will send me to the location of a document describing the thing. Let’s see if the FOAF vocabulary correctly implements 303 redirects. What happens if I fetch the URI of the foaf:knows property?
curl -I http://xmlns.com/foaf/0.1/knows
HTTP/1.1 303 See Other
Date: Mon, 05 Feb 2007 19:09:55 GMT
Server: Apache/1.3.37 (Unix)
Location: http://xmlns.com/foaf/0.1/
Content-Type: text/html; charset=iso-8859-1
There’s the 303 status code, and the Location header gives the URL of the document that describes the foaf:knows property: in this case, the FOAF specification.
If we got a 200 OK status code instead, then the URI would need fixing because foaf:knows is an RDF property and not a document.
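The same check can be scripted. What follows is only a sketch under my own naming: status_of, location_of and check_303 are hypothetical helpers, not cURL features. They read a header dump from standard input, pick out the status code and the Location header, and complain if the status is anything other than 303.

```shell
# Print the numeric status code from the first status line on stdin.
status_of() {
  tr -d '\r' | awk '/^HTTP\// { print $2; exit }'
}

# Print the value of the first Location header on stdin.
location_of() {
  tr -d '\r' |
    awk 'tolower($0) ~ /^location:/ { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Report whether the response on stdin is a 303 redirect.
# Returns non-zero for any other status code.
check_303() {
  headers=$(cat)
  status=$(printf '%s\n' "$headers" | status_of)
  if [ "$status" = "303" ]; then
    echo "303 -> $(printf '%s\n' "$headers" | location_of)"
  else
    echo "expected 303, got $status"
    return 1
  fi
}
```

For the FOAF example this would be invoked as curl -sI http://xmlns.com/foaf/0.1/knows | check_303. Keep in mind that a 200 is only wrong for URIs that name non-document things; for an ordinary document URL it is the right answer.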
Content negotiation: Good Semantic Web servers are configured to do another trick: they will redirect Semantic Web browsers to RDF documents, while plain old Web browsers are sent to HTML documents. To simulate a Semantic Web browser, we have to send an HTTP header Accept: application/rdf+xml along with the request. This is done using cURL’s -H parameter:
curl -I -H "Accept: application/rdf+xml" http://www4.wiwiss.fu-berlin.de/dblp/resource/person/103481
HTTP/1.1 303 See Other
Date: Tue, 06 Feb 2007 11:23:55 GMT
Server: Jetty/5.1.10 (Windows 2003/5.2 x86 java/1.5.0_09
Location: http://www4.wiwiss.fu-berlin.de/dblp/sparql?query=DESCRIBE+%3Chttp%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdblp%2Fresource%2Fperson%2F103481%3E
Content-Type: text/plain
If we send the same request without the header, we get:
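HTTP/1.1 303 See Other
Date: Tue, 06 Feb 2007 11:25:20 GMT
Server: Jetty/5.1.10 (Windows 2003/5.2 x86 java/1.5.0_09
Location: http://www4.wiwiss.fu-berlin.de/dblp/page/person/103481
Content-Type: text/plain
The Location header is what differs: with the Accept header we are sent to a SPARQL DESCRIBE query for the resource, without it to a page URL for the same person.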
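Comparing the two Location headers by hand works, but it is easy to script. The sketch below assumes the server negotiates via redirects, as the DBLP server does here; a server that answers 200 directly with different content types would need a different check. The names redirect_for and check_conneg are made up for this example. When no Accept header is given, cURL sends its default, Accept: */*.

```shell
# Print the Location header a URI returns for a given Accept value.
# An empty second argument leaves cURL's default Accept header in place.
redirect_for() {
  uri=$1
  accept=$2
  if [ -n "$accept" ]; then
    curl -sI -H "Accept: $accept" "$uri"
  else
    curl -sI "$uri"
  fi | tr -d '\r' |
    awk 'tolower($0) ~ /^location:/ { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Succeed if the RDF request and the default request are redirected to
# different documents, which is what content negotiation should produce.
check_conneg() {
  rdf=$(redirect_for "$1" "application/rdf+xml")
  html=$(redirect_for "$1" "")
  echo "RDF:     $rdf"
  echo "default: $html"
  [ -n "$rdf" ] && [ "$rdf" != "$html" ]
}
```

For the example above, that would be check_conneg http://www4.wiwiss.fu-berlin.de/dblp/resource/person/103481. The function prints both redirect targets and its exit status tells you whether they differ.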
Summary: So here’s how to examine URIs with cURL.
Check the contents that a normal web browser will see:
curl <uri>
Check the response headers that a normal web browser will see:
curl -I <uri>
Check the contents that a Semantic Web browser will see:
curl -H "Accept: application/rdf+xml" <uri>
Check the response headers that a Semantic Web browser will see:
curl -I -H "Accept: application/rdf+xml" <uri>
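The header checks stop at the first response. If you also want to know what a Semantic Web browser ends up with after following the 303, cURL's -L parameter makes it follow redirects, and with -I it prints one header block per hop. Here is a rough sketch of a filter for that output; final_response is a name I made up, and it simply reports the status code and media type of the last block it sees.

```shell
# Read one or more HTTP header blocks from stdin (as printed by
# curl -sIL) and print "<status> <media type>" for the last block.
final_response() {
  tr -d '\r' | awk '
    /^HTTP\// { status = $2; ctype = "" }   # a new response begins
    tolower($0) ~ /^content-type:/ {
      sub(/^[^:]*:[ \t]*/, "")              # strip the header name
      sub(/;.*/, "")                        # strip charset and friends
      ctype = tolower($0)
    }
    END { print status, ctype }
  '
}
```

It would be used as curl -sIL -H "Accept: application/rdf+xml" <uri> | final_response. For a well-behaved URI I would expect the last hop to be a 200 with application/rdf+xml.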
You can’t tell if a URI will work on the Semantic Web just by opening it in a Web browser. But you can tell with cURL.