Debugging Semantic Web sites with cURL

Here at our group we spend a lot of time preaching the benefits of dereferenceable URIs. We often want to know if a certain URI supports all the fancy HTTP tricks that are the cornerstones of RDF publishing best practices, like 303 redirects and content negotiation.

My tool of choice for this is cURL, a command-line HTTP client that makes a useful addition to any Semantic Web developer’s toolbox. This tutorial shows how to use cURL to test Semantic Web URIs and to diagnose some common problems.

Getting cURL: Windows users can get cURL binaries from here; the first “non-SSL binary” version will work. Find curl.exe in the archive and put it somewhere on the PATH, e.g. in C:\Windows. On Mac OS X and most Linux distributions, cURL is pre-installed.

To test cURL, open a command prompt and invoke

curl http://example.com/

You should see the HTML source code of the Example Web Page.

So let’s see some of the things we can do with cURL.

Checking content types: On the Web, content types are used to distinguish between content in different formats, e.g. human-readable HTML (Content-Type: text/html) and machine-readable RDF/XML data (Content-Type: application/rdf+xml). When you request a URI, the server sends the content type and other HTTP headers along with the response. Many Semantic Web clients don’t work properly unless RDF content is served with the correct content type.

To check this with cURL, use the -I option. It makes cURL send a HEAD request and print only the HTTP headers sent by the server.

curl -I http://sites.wiwiss.fu-berlin.de/suhl/bizer/foaf.rdf

The URL is the FOAF file of Chris Bizer. Result:

HTTP/1.1 200 OK
Content-Length: 13746
Content-Type: application/rdf+xml
Last-Modified: Thu, 18 Jan 2007 10:27:22 GMT
Accept-Ranges: bytes
ETag: "bf3d723deb3ac71:54d"
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Tue, 06 Feb 2007 10:52:51 GMT

The important line is the Content-Type header. We see that the file is served as application/rdf+xml, just as it should be. If we saw text/plain here, or if the Content-Type header were missing, the server configuration would need fixing.
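For scripted checks, the Content-Type value can be pulled out of the header dump. What follows is only a minimal sketch: the function name content_type is made up for illustration, and it assumes a POSIX shell with tr and awk available.

```shell
# content_type: hypothetical helper (not part of cURL). Reads HTTP response
# headers on stdin and prints the media type of each Content-Type header,
# without parameters such as "; charset=...".
content_type() {
  tr -d '\r' | awk '
    tolower($0) ~ /^content-type:/ {
      sub(/^[^:]*:[ \t]*/, "")   # drop the header name
      sub(/[ \t]*;.*$/, "")      # drop parameters like charset
      print
    }'
}

# Usage (needs network access):
#   curl -sI http://sites.wiwiss.fu-berlin.de/suhl/bizer/foaf.rdf | content_type
```

The -s option only silences cURL's progress output, so the pipe receives nothing but the headers.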

Checking for 303 redirects: RDF publishers often use 303 redirects to distinguish between URLs for Web documents and URIs for Semantic Web resources. The idea is that when I fetch the URI of a non-document thing (e.g. a person or country or OWL class), then the response will send me to the location of a document describing the thing. Let’s see if the FOAF vocabulary correctly implements 303 redirects. What happens if I fetch foaf:knows?

curl -I http://xmlns.com/foaf/0.1/knows

Response:

HTTP/1.1 303 See Other
Date: Mon, 05 Feb 2007 19:09:55 GMT
Server: Apache/1.3.37 (Unix)
Location: http://xmlns.com/foaf/0.1/
Content-Type: text/html; charset=iso-8859-1

There’s the 303 status code, and the Location header gives the URL of the document that describes the foaf:knows property: in this case, the FOAF specification.

If we got a 200 OK status code instead, then the URI would need fixing because foaf:knows is an RDF property and not a document.
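When checking many URIs, it can help to boil the header dump down to the two things that matter here: the status code and the redirect target. This is a rough sketch under the same assumptions as before (POSIX shell, tr, awk); the name redirect_info is invented for this example, and it expects the headers of a single response, i.e. cURL run without -L.

```shell
# redirect_info: hypothetical helper (not part of cURL). Reads the headers of
# one HTTP response on stdin and prints its status code, followed by the
# Location header if there is one.
redirect_info() {
  tr -d '\r' | awk '
    NR == 1 { print "status: " $2 }      # status line, e.g. HTTP/1.1 303 See Other
    tolower($0) ~ /^location:/ {
      sub(/^[^:]*:[ \t]*/, "")           # drop the header name
      print "location: " $0
    }'
}

# Usage (needs network access):
#   curl -sI http://xmlns.com/foaf/0.1/knows | redirect_info
```

For a URI that names a non-document thing, we would hope to see a status of 303 and a location line; a bare status of 200 points to the problem described above.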

Content negotiation: Good Semantic Web servers are configured to do another trick: They will redirect Semantic Web browsers to RDF documents, while plain old Web browsers are sent to HTML documents. To simulate a Semantic Web browser, we have to send an HTTP header Accept: application/rdf+xml along with the request. This is done using cURL’s -H parameter:

curl -I -H "Accept: application/rdf+xml" http://www4.wiwiss.fu-berlin.de/dblp/resource/person/103481

Response:

HTTP/1.1 303 See Other
Date: Tue, 06 Feb 2007 11:23:55 GMT
Server: Jetty/5.1.10 (Windows 2003/5.2 x86 java/1.5.0_09
Location: http://www4.wiwiss.fu-berlin.de/dblp/sparql?query=DESCRIBE+%3Chttp%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdblp%2Fresource%2Fperson%2F103481%3E
Content-Type: text/plain

If we send the same request without the header, we get:

HTTP/1.1 303 See Other
Date: Tue, 06 Feb 2007 11:25:20 GMT
Server: Jetty/5.1.10 (Windows 2003/5.2 x86 java/1.5.0_09
Location: http://www4.wiwiss.fu-berlin.de/dblp/page/person/103481
Content-Type: text/plain

Checking the two locations, we find that the first one serves RDF/XML, while the second one serves HTML.
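Checking the two locations by hand can be shortened with cURL's -L option, which follows redirects. With -I and -L together, cURL prints one header block per hop, so the content type we care about is the one in the last block. The sketch below picks it out; final_content_type is a hypothetical name, and the approach assumes the server answers HEAD requests the same way it answers GET requests, which is not guaranteed.

```shell
# final_content_type: hypothetical helper (not part of cURL). Reads one or
# more HTTP header blocks on stdin, as printed by "curl -sIL", and prints
# the media type of the last response only.
final_content_type() {
  tr -d '\r' | awk '
    /^HTTP\// { type = "" }              # a new response starts: forget the old value
    tolower($0) ~ /^content-type:/ {
      type = $0
      sub(/^[^:]*:[ \t]*/, "", type)     # drop the header name
      sub(/[ \t]*;.*$/, "", type)        # drop parameters like charset
    }
    END { print type }'
}

# Usage (needs network access):
#   curl -sIL -H "Accept: application/rdf+xml" <uri> | final_content_type
#   curl -sIL <uri> | final_content_type
```

If content negotiation is set up as described, the first command should print application/rdf+xml and the second text/html.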

Summary: So here’s how to examine URIs with cURL.

Check the contents that a normal web browser will see:

curl <uri>

Check the response headers that a normal web browser will see:

curl -I <uri>

Check the contents that a Semantic Web browser will see:

curl -H "Accept: application/rdf+xml" <uri>

Check the response headers that a Semantic Web browser will see:

curl -I -H "Accept: application/rdf+xml" <uri>

You can’t tell if a URI will work on the Semantic Web just by opening it in a Web browser. But you can tell with cURL.


9 Responses to Debugging Semantic Web sites with cURL

  1. Joe Betz says:

    Hi Richard,

    For posterity: I also use the -L option quite often. It follows the ‘See Other’ redirects and returns the contents (or headers) of the URI containing the actual semantic data.

  2. Nelson de Moura says:

    After a cURL POST, the Location: header comes back with a relative URL.

    i.e. /PessoasHomolog/Candidatos/Apresentacao.aspx?uid=CPds5pFRP7NZVtoYaDvA33DYn5i2VO8k9z2TNXgSurEku1B7JcFD4DuynO%2fpq6dW

    And when I set CURLOPT_FOLLOWLOCATION to follow the Location, my own domain gets put before it:

    i.e. http://www.mydomain.com/PessoasHomolog/Candidatos/Apresentacao.aspx?uid=CPds5pFRP7NZVtoYaDvA33DYn5i2VO8k9z2TNXgSurEku1B7JcFD4DuynO%2fpq6dW

    What I’m looking for is a way to merge “http://www.externaldomain.com” with the Location: in the results.

    i.e. http://www.externaldomain.com/PessoasHomolog/Candidatos/Apresentacao.aspx?uid=CPds5pFRP7NZVtoYaDvA33DYn5i2VO8k9z2TNXgSurEku1B7JcFD4DuynO%2fpq6dW

    Thanks for any help.
    nelsonmoura@gmail.com

  3. osorio says:

    Nelson, could you be a little more specific about what you are trying to say?
    It looks like you’re trying to redirect a URL. Maybe someone will email you a solution, that is, if you haven’t solved it already.
