cygri’s notes on web data

Multiple Java versions on OS X, and their paths

Posted on August 18, 2009 by Richard Cyganiak

Java version management on OS X is a wee bit complicated. Here’s what I understand.

All different versions of Java are installed into the directory:
/System/Library/Frameworks/JavaVM.framework/Versions/

For example, the JDK home directory for 1.4.2 would be Versions/1.4.2/Home/.

There are three ways to access specific versions.

The hardcoded default. Each version of OS X ships with a hardcoded default Java version. For OS X Leopard, this is Java 1.5. This version can always be accessed through the path /Library/Java/Home.
The Java Preferences application. Recent versions of Java on OS X install an application, “Java Preferences”, into /Applications/Utilities. It allows you to select the preferred version by dragging it to the top of the list. There is one list for applets, and one list for applications and the command line. The terminal commands (e.g., java) use the version at the top of the second list. The path of this version, in case you need it, can be obtained by running the command /usr/libexec/java_home.
Setting JAVA_HOME. Doing so overrides the choice made in the Java Preferences application. If JAVA_HOME is set, the command line applications will use that version. However, /usr/libexec/java_home will still return the path of the version that was selected in Java Preferences.

“Magic” versions. In addition to the different Java versions, the Versions directory also contains several “magic” directories:

Versions/CurrentJDK/ is the hardcoded default version of the OS, so on Leopard it will always be a symlink pointing to Versions/1.5/. Note that this is not affected by whatever version is selected in Java Preferences or via JAVA_HOME. Manually changing the symlink to make it point to a different version is probably not a good idea. /Library/Java/Home points here.

Versions/Current/ is an alias that points to Versions/A/. This, in turn, is not a proper Java version like the other directories in /Versions/. It contains internal parts of the OS X Java machinery, e.g., in Versions/Current/Commands there are “fake” binaries such as java, javac and javadoc that internally use /usr/libexec/java_home (and JAVA_HOME, if set) to find the “real” binary. When you call Java commands from the command line, you actually invoke these “fake” binaries (via symlinks in /usr/bin). These are system internals, and poking around in there too much is probably not a good idea.

Finding the current version. So, what is the correct way to determine the location of the current Java version on OS X, say from a shell script?

If JAVA_HOME is set, use that.
Otherwise, invoke /usr/libexec/java_home to find the path.
If that fails, fall back to /Library/Java/Home.
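
The three steps above can be sketched in a few lines. This is only a rough illustration in Python rather than shell; the names find_java_home and run_java_home_tool are my own invention, and the two paths are the ones mentioned in this post.

```python
import os
import subprocess

# Paths taken from the post; everything else here is illustrative.
JAVA_HOME_TOOL = "/usr/libexec/java_home"
FALLBACK_JAVA_HOME = "/Library/Java/Home"


def run_java_home_tool():
    """Return the path printed by /usr/libexec/java_home, or None on failure."""
    try:
        result = subprocess.run(
            [JAVA_HOME_TOOL], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # Tool missing (e.g. not on OS X) or it exited with an error.
        return None
    return result.stdout.strip() or None


def find_java_home(env=None, tool=run_java_home_tool):
    """Apply the three-step lookup: JAVA_HOME, then java_home, then fallback."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if java_home:
        return java_home
    return tool() or FALLBACK_JAVA_HOME
```

The env and tool parameters exist only so that the lookup order can be tried out without touching the real environment; calling find_java_home() with no arguments uses os.environ and the real tool.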

Should you set JAVA_HOME? If you don’t need it, don’t set it at all. If you need it, setting it via

export JAVA_HOME=`/usr/libexec/java_home`

is probably not a bad idea, because this will reflect future changes to the selected version from the Java Preferences application.

Posted in General | 2 Comments

Back from ESWC 2008

Posted on June 9, 2008 by Richard Cyganiak

The European Semantic Web Conference is over and I’m back in Galway. It was a lot of fun, two hundred smart and interesting people in a sea-side tourist resort, and the thought “I can’t believe I’m getting paid for this” crossed my mind a few times.

It was good to see that some of the topics close to my interest — SPARQL, Linked Data, the LOD project, Semantic Search — were well-represented in the conference program and in the attention of the crowd.

I gave four talks:

1. A five-minute lightning talk on Sindice.com in Sunday’s spontaneous and surprisingly well-attended LOD gathering. Slides (PDF)

2. Neologism: Easy Vocabulary Publishing, together with Sergio Fernández, in the SFSW workshop, about our soon-to-be-released RDFS vocabulary editor and publishing system. Slides (PDF)

3. Semantic Sitemaps about DERI’s proposed Semantic Sitemaps standard for improved discovery of RDF data published on the Web. Slides (PDF)

4. There’s more than US-ASCII, a two-minute lightning talk in which I complain bitterly about the inability of Semantic Web developers to properly handle characters outside of the usual US-ASCII charset in their apps. The single slide is reproduced below.

The best part was meeting in person for the first time some of the people I’m working with on a regular basis, including Michael Hausenblas, Keith Alexander, Hugh Glaser, Andreas Langegger, and Olaf Hartig, and making some exciting new connections. Email is great, but sitting down face to face still beats it by an order of magnitude in effectiveness, and also in fun if drinks are involved.

Posted in General, Semantic Web | 2 Comments

What is your RDF browser’s Accept header?

Posted on March 17, 2008 by Richard Cyganiak

I was debugging a content negotiation issue the other day and made a little tool that shows which Accept header different RDF-aware HTTP clients send. If you ever need to know the Accept header of a particular RDF-aware HTTP client, just make it show the RDF loaded from this URI:

http://richard.cyganiak.de/2008/03/rdfbugs/accept.php

The RDF contains a dc:description with the browser’s Accept header. If you have the Tabulator Firefox extension installed, you can simply click the link and see the output.

I tried this with a couple of tools and here are the results:

Tabulator Firefox Extension 0.8.2:

application/rdf+xml, application/xhtml+xml;q=0.3, text/xml;q=0.2,
application/xml;q=0.2, text/html;q=0.3, text/plain;q=0.1, text/n3,
text/rdf+n3;q=0.5, application/x-turtle;q=0.2, text/turtle;q=1

Jena’s Model.read(…) method:

application/rdf+xml, application/xml; q=0.8, text/xml; q=0.7,
application/rss+xml; q=0.3, */*; q=0.2

Disco Hyperdata Browser:

application/rdf+xml;q=1,text/xml;q=0.6,text/rdf+n3;q=0.9,
application/octet-stream;q=0.5,application/xml q=0.5,
application/rss+xml;q=0.5,text/plain; q=0.5,application/x-turtle;q=0.5,
application/x-trig;q=0.5,text/html;q=0.5

OpenLink RDF Browser:

application/rdf+xml, text/rdf+n3, application/rdf+turtle,
application/x-turtle, application/turtle, application/xml, */*

SindiceBot:

application/rdf+xml, application/xml;q=0.6, text/xml;q=0.6

Some of these are pretty funny actually, but that’s a post for another day.
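
To compare headers like the ones above, it helps to split them into media types and q values. Here is a minimal sketch in Python; parse_accept is a made-up helper, and it is deliberately simplified (no quoted strings, no parameters other than q), so treat it as an illustration rather than a complete HTTP header parser.

```python
def parse_accept(header):
    """Split an Accept header into (media_type, q) pairs, highest q first.

    Simplified: a missing or unparseable q counts as 1.0, and entries with
    equal q keep the order in which the client listed them.
    """
    entries = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    pass  # keep the default for a garbled q value
        entries.append((media_type, q))
    # sorted() is stable, so ties stay in their original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


sindice = "application/rdf+xml, application/xml;q=0.6, text/xml;q=0.6"
print(parse_accept(sindice))
```

Note that an entry without a semicolon before its q parameter, such as the application/xml q=0.5 in one of the headers above, comes out of this sketch as a single odd-looking media type with the default q of 1.0.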

Posted in General, Semantic Web | 7 Comments

Tabulator does N3

Posted on March 17, 2008 by Richard Cyganiak

In my podcasted chat with Danny Ayers the other day I said that Tabulator doesn’t support N3, the highly readable RDF serialization syntax developed by Tim Berners-Lee.

It turns out I was wrong. I thought Tabulator supported RDF/XML only, but it has excellent support for N3 as well. I’m not sure how I managed to miss this. Seems like the Tab’r team sneaked that feature in while no one (or at least not I) was looking!

To make Tabulator eat your N3, you just have to make sure it’s served with the right content type: text/rdf+n3. If you publish N3, you can test this by installing cURL (see my earlier cURL for SemWebbers tutorial) and running:

curl -I "http://example.com/myfile.n3"

If your server is set up correctly, then the output contains a line like this:

Content-Type: text/rdf+n3; charset=utf-8

If you see e.g. text/plain instead, then your server is misconfigured. If you serve static files with Apache, you can fix this by adding this line to your Apache’s httpd.conf file or to a file called .htaccess in your DocumentRoot:

AddType text/rdf+n3;charset=utf-8 .n3

And then make sure that your filenames end in .n3.

Actually, this is really good news. As far as I’m concerned, the Tabulator Firefox extension, being the most readily accessible Semantic Web client currently out there, defines what is on the Semantic Web and what isn’t. If you can’t browse it in Tabulator, then it isn’t. (Insert caveat about RDFa and SPARQL here, which Tabulator probably will support in the future.)

I really like N3, because it is a much friendlier format than the alternative RDF/XML. It is very easy to read, generate, and even hand-write. More so than, say, JSON. A Semantic Web built on N3 would be a much nicer place than a Semantic Web built on RDF/XML. Tool support for N3 is still not quite as good as for RDF/XML, but we are getting there.

So, what does this mean in practice?

Do you develop or maintain an application that consumes RDF from the Web? If you do, then you should make sure that it understands both RDF/XML and N3.
Do you develop or maintain a library, framework or API that can load and parse RDF from the Web, from a URI? If you do, then you should make sure that it invokes the right parser, depending on the Content-Type header of the response. This should happen completely transparently from your users’ point of view.
Do you author educational material, tutorials or slides that use RDF in examples? Do your audience a favor and do them in N3, not RDF/XML!
If you already produce RDF/XML, as an information publisher, you shouldn’t worry. RDF-aware clients won’t stop supporting RDF/XML.
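
For the library case in the list above, picking the parser from the Content-Type header might look roughly like this. It is a sketch only: the parser labels "rdfxml" and "n3" are placeholders of mine and not the names used by any particular RDF library, and the table only covers the two media types discussed in this post.

```python
# Hypothetical mapping from media type to a parser label.
PARSERS_BY_MEDIA_TYPE = {
    "application/rdf+xml": "rdfxml",
    "text/rdf+n3": "n3",
}


def media_type_of(content_type):
    """Strip parameters such as charset and normalise the case."""
    return content_type.split(";", 1)[0].strip().lower()


def choose_parser(content_type):
    """Return the parser label for a Content-Type header value."""
    media_type = media_type_of(content_type)
    try:
        return PARSERS_BY_MEDIA_TYPE[media_type]
    except KeyError:
        raise ValueError(f"no RDF parser for {media_type!r}") from None


print(choose_parser("text/rdf+n3; charset=utf-8"))
```

A real library would hand the response body to the chosen parser; the point here is only that the decision is driven by the header, not by the file extension or by guessing.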

Making these things happen in the tools and documents that I myself maintain will be quite a bit of work. But I think it’s worth it. N3 is a bit like the old days of HTML: you can actually view source and understand what’s going on. N3 is the human-friendly way of writing RDF.

Posted in General, Semantic Web | 6 Comments

QOTD: timbl is a document

Posted on January 24, 2008 by Richard Cyganiak

Simon Spero on the SKOS list:

The meaning of “documet” in this context is extremely broad; if we follow Otlet’s definition of a document as anything which can convey information to an observer, the term would seem to cover anything which can have a subject.

By this standard, timbl is a document, but only when someone’s looking.

Ah, the Semantic Web community! Please leave your common sense at the door …

Posted in General | 3 Comments

I named 61 HTML elements in 5 minutes.

Posted on November 26, 2007 by Richard Cyganiak

(via Dominik Wagner)

Posted in General | 1 Comment

A challenge for semantic query wizards

Posted on November 13, 2007 by Richard Cyganiak

Bernard Vatant:

A challenge for semantic query wizzards: Find the smallest connected graph having at least a node in each LOD data set

Interesting. The hardest part might be finding paths through the FOAF bubble.

Any takers?

Posted in General, Semantic Web | Comments Off

Finding Rahoon

Posted on November 13, 2007 by Richard Cyganiak

Rahoon is a nondescript area of Galway, the third-largest city in the Republic of Ireland. Unsurprisingly, I had never heard of Rahoon before last week, when I moved there to join Giovanni Tummarello’s team at DERI Galway.

Moving is stressful. But being an RDF geek, I not only have to drag physical stuff around, but I also have to update a few triples. For example, by joining DERI I got yet another URI:

http://www.deri.ie/fileadmin/scripts/foaf.php?id=313#me

That’s a new owl:sameAs for my FOAF file. I also have to update my foaf:based_near triple. Until now, my FOAF file stated:

<#cygri> foaf:based_near <http://dbpedia.org/resource/Berlin> .

The obvious new value for this property would be http://dbpedia.org/resource/Galway. But I want to be a bit more specific. That’s where this post’s title comes in. *What is the URI of Rahoon?*

The big datasets: Unfortunately, Wikipedia doesn’t have an article for Rahoon, hence it isn’t in DBpedia.

Then there’s Geonames, the most comprehensive source of geographical information on the Semantic Web. It has an entry for a nearby structure called Rahoon House, but not for the area itself.

The search engines: Next I tried all the Semantic Web search engines from the Linking Open Data project’s list.

Of the seven available services, only three produced any results. I was quite disappointed that the venerable Swoogle, one of the first large-scale Semantic Web indices, didn’t return any results at all.

SWSE (a DERI project) returned two hits, but neither of them was semantic (a web page mentioning Rahoon and a Bollywood-related RSS item).

Sindice (disclosure: it’s another DERI project, and I am now a team member) did much better. It found Geonames’ Rahoon House and a bunch of loosely Rahoon-related things from DBpedia, but it didn’t turn up any URI for the Rahoon area itself.

Finally there is Falcons, a recently announced project developed at Nanjing’s Southeast University. Its results are similar to Sindice’s, except that it missed the Geonames entry.

In summary, only Falcons and Sindice found anything of interest, but neither struck gold.

Mint your own? At this point, I perhaps have to accept that the only existing relationship between RDF and Rahoon is the fact that Giovanni has been living here for a while, and that I’m not going to find a URI for the place.

The usual advice at this point is to mint a new URI for the thing in question. But I don’t want to go down this road, because I simply do not feel sufficiently competent in matters of Irish geography. What “is” Rahoon, really? An administrative area? A geographical region? A postal code? A bus stop? I don’t know.

Based near a blank node: My solution is to ignore the widely accepted wisdom that RDF blank nodes are considered harmful. I will state that I’m living near something, and describe that something as well as I can:

foaf:based_near [
    a pos:Point;
    pos:lat "53.27702";
    pos:long "-9.09019";
    rdfs:label "125 Rosan Glas, Rahoon, Galway, Ireland";
    geo:parentFeature <http://sws.geonames.org/2964180/>
];

The skeleton of this N3 fragment was easily created using my FOAF geolocator (which I recently extended with address search and N3 output — have a look at it if your FOAF file still lacks geolocation!). I added a label and a link to the next-largest Geonames feature (Galway), which I easily found with Sindice.
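
For illustration, producing a fragment of this shape from a pair of coordinates takes only a few lines. The following Python sketch is not the geolocator's actual code; based_near_n3 and n3_string are hypothetical helpers, and the pos:, rdfs: and geo: prefixes are assumed to be declared elsewhere in the file.

```python
def n3_string(text):
    """Quote a value as an N3 string literal (escapes backslash and quote only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def based_near_n3(lat, long, label=None, parent_feature=None):
    """Build a foaf:based_near blank node like the fragment above."""
    properties = [
        "a pos:Point",
        f"pos:lat {n3_string(lat)}",
        f"pos:long {n3_string(long)}",
    ]
    if label is not None:
        properties.append(f"rdfs:label {n3_string(label)}")
    if parent_feature is not None:
        properties.append(f"geo:parentFeature <{parent_feature}>")
    body = ";\n".join("    " + prop for prop in properties)
    return f"foaf:based_near [\n{body}\n];"


print(based_near_n3(
    "53.27702",
    "-9.09019",
    label="125 Rosan Glas, Rahoon, Galway, Ireland",
    parent_feature="http://sws.geonames.org/2964180/",
))
```

Coordinates are passed as strings so that they end up in the output exactly as written, without any floating-point reformatting.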

Rahoon is still without a URI, but I guess I should let it be for now and rather worry about applying for a social security number, registering for taxation, and so forth.

DERI! At any rate, I’m happy to be here and look forward to working with a great team on some very exciting projects.

Posted in General, Semantic Web | 12 Comments

LazyWeb request

Posted on April 11, 2007 by Richard Cyganiak

a) A Sudoku solving web service.
b) A Sudoku generating web service.
c) Hook them up to each other.

Posted in General, Semantic Web | 4 Comments

Objectviewer: Yet another linked data browser

Posted on April 3, 2007 by Richard Cyganiak

Via Troy Self’s introduction to the Linking Open Data list I came across ObjectViewer, yet another Semantic Web browser based on the linked data principles. This increases the number of available Semantic Web browser prototypes to four: Tabulator, Disco, the OpenLink Ajax Toolkit browser, and now ObjectViewer.

ObjectViewer has quite a nice visualization of the browsed resource as a simple graph, which isn’t really all that useful in practice, but always makes for stunning demos. A live example on some dbpedia data is here (the browser chrome is missing, I couldn’t figure out how to make a proper direct link), and a screenshot is below.

Webby data everywhere!