Continuing a thread about views in triple stores.
Leigh Dodds pointed out the need for something like SQL’s views in RDF stores, and suggested vocabulary namespaces as a partitioning mechanism:
The […] subset may be created by filtering out the classes and properties extracted from the database based on their namespaces. For example I might have a triple store containing a mixture of public/private data, with the latter in a separate namespace and I want to pull out just the public aspects for returning from a web service.
I replied, somewhat off-hand:
I think this is not such a good idea. Namespaces are just that, namespaces. Don’t overload them with access control.
I’m not sure how he’s using the term “access control” here, that’s not what I was suggesting. And namespaces are intended to partition vocabularies, so not using them as a way to ignore data that’s not of interest seems bizarre to me.
First, Leigh, I do agree with you about the value of views, especially for protecting applications from schema changes.
Yes, namespaces are for partitioning vocabularies. But along what kind of boundaries? I think that the best boundaries are semantic. Classes and properties about persons go into one vocabulary. Classes and properties about computer software into another.
Your suggestion is to use your application’s publishing policy as a boundary. Stuff intended for public use goes into one vocabulary, the private backend data into another.
As long as this distinction roughly coincides with a semantic boundary, I can see no fault with this approach. But otherwise, there are a number of downsides:
- Changing your publishing policy gets harder. If you decide to make some formerly private data public, you have to move properties between vocabularies, a rather expensive operation.
- Having more than two views, with some overlap, becomes difficult, because each term lives in exactly one namespace.
- You reduce the reusability of your vocabulary. Other parties might want to re-use your terms, but not your publishing policy.
In SQL, views can be created, changed and removed without affecting the underlying database schema and data. That’s their whole point. With the namespace-as-view approach, changing a view means changing the schema (vocab term URIs) and the data (triples using the term).
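To make the contrast concrete, here is a minimal Python sketch of the alternative: a view kept as a plain set of property URIs, stored apart from the vocabulary and the data. The `http://example.org/vocab#` namespace, the sample triples and the `apply_view` helper are all made up for illustration; a real store would presumably do this filtering in its query layer.

```python
# Hypothetical vocabulary namespace and data, invented for illustration only.
EX = "http://example.org/vocab#"

triples = [
    ("ex:alice", EX + "name", "Alice"),
    ("ex:alice", EX + "email", "alice@example.org"),
    ("ex:alice", EX + "salary", "50000"),
]

# Each view is just a set of property URIs, kept separate from schema and data.
# All terms stay in one semantically organised vocabulary.
views = {
    "public": {EX + "name"},
    "partners": {EX + "name", EX + "email"},  # overlaps with "public"
}


def apply_view(triples, view):
    """Return only the triples whose predicate is listed in the view."""
    return [t for t in triples if t[1] in view]


public_before = apply_view(triples, views["public"])

# Changing the publishing policy edits the view definition only;
# no term URI is renamed and no triple is rewritten.
views["public"].add(EX + "email")
public_after = apply_view(triples, views["public"])
```

Here the overlapping "partners" view and the policy change both leave the triples untouched, which is the property that the namespace-as-view approach gives up.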
Of course, I don’t know anything about the use case that prompted Leigh’s original post, and everything I say might or might not apply to the specific circumstances. I’m just voicing a general design opinion here. So as long as it gets the job done …