Topic Maps, RDF, semantics, merging ...
Posted in Technology on 2008-07-29 16:54
Merged creature, Moss, Norway
Michael Sperberg-McQueen wrote a blog post on RDF and Topic Maps where he brought up some interesting questions. I started trying to reply in a comment on his blog, but after writing two full pages I decided to instead turn it into a blog posting here.
Michael started out by considering whether one of XML or RDF could be said to have more intrinsic semantics than the other, in the sense that one would (in the absence of any knowledge of the vocabulary used) tell you more about the meaning of any data set you find. Michael doesn't really find any significant differences, but I disagree with him on that score.
I find one significant difference: in RDF the world is segmented into a definite set of objects (resources, subjects, entities, whatever), whereas in XML it is not. For example, if you look at RSS-in-RDF there is no doubt that there is one object for the channel and one per item, whereas if you look at RSS-in-XML it's just a bunch of elements. This means, to take just one example, that mapping from RDF to other data representations, such as Topic Maps, databases, and object models, is much easier than mapping from XML.
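To make the segmentation point concrete, here is a minimal Python sketch. It assumes a made-up feed, and the identifiers (`ex:channel`, `ex:item1`, `ex:hasItem` and so on) are hypothetical illustrations, not the actual RSS 1.0 vocabulary. The RDF data is modelled as plain (subject, predicate, object) tuples, and grouping by subject recovers the objects without any knowledge of what the properties mean.

```python
from collections import defaultdict

# Hypothetical RSS-like data as (subject, predicate, object) triples.
triples = {
    ("ex:channel", "ex:title", "My blog"),
    ("ex:channel", "ex:hasItem", "ex:item1"),
    ("ex:channel", "ex:hasItem", "ex:item2"),
    ("ex:item1", "ex:title", "First post"),
    ("ex:item2", "ex:title", "Second post"),
}

def objects_in(graph):
    """Group statements by subject: every distinct subject is one object."""
    grouped = defaultdict(dict)
    for subject, predicate, value in graph:
        grouped[subject].setdefault(predicate, []).append(value)
    return dict(grouped)

objects = objects_in(triples)
print(sorted(objects))  # one entry for the channel, one per item
```

Doing the same for the XML version would require knowing in advance which elements stand for objects and which for properties.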
Topic Maps have the same advantage over XML, but then go on to add some that RDF does not share:
- You know the name of every object. This is human-oriented, practical semantics, rather than the sort of semantics logicians would recognize, but no less useful for that.
- You know which objects are information resources and which are not. The same comment applies here.
- You know which objects are n-ary relations and which are not. (You can express n-ary relations in RDF, but they cannot be distinguished from ordinary objects afterwards.) It could be argued that there is no real distinction here, and while this is strictly speaking true, the distinction is important enough to have merited at least one W3C note.
- You know the context of each statement. You may not know what it means, but you can at least see if there is a limiting context, and which statements share a context and which do not.
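The point about n-ary relations can be sketched roughly as follows. This is a heavily simplified illustration, not the real Topic Maps data model or any real API: the `Association` class, the `ex:` identifiers and the employment example are all assumptions made up for the sketch.

```python
from dataclasses import dataclass

# RDF side: a three-way employment relation has to be turned into a node
# (ex:emp1) with one property per participant. All names are hypothetical.
rdf_graph = {
    ("ex:alice", "ex:name", "Alice"),
    ("ex:emp1", "ex:employee", "ex:alice"),
    ("ex:emp1", "ex:employer", "ex:acme"),
    ("ex:emp1", "ex:position", "ex:developer"),
}

def rdf_subjects(graph):
    """Everything with properties looks the same: it is just a subject."""
    return {subject for subject, _, _ in graph}

# Topic Maps side, heavily simplified: an association is its own kind of
# construct, with typed roles and an optional scope (context).
@dataclass(frozen=True)
class Association:
    type: str
    roles: frozenset          # of (role type, player) pairs
    scope: frozenset = frozenset()

topics = {"alice", "acme", "developer"}
associations = {
    Association(
        type="employment",
        roles=frozenset({
            ("employee", "alice"),
            ("employer", "acme"),
            ("position", "developer"),
        }),
    )
}

# In the RDF graph the relation node and the person are indistinguishable
# without knowing the vocabulary; in the topic map they live in different sets.
print(rdf_subjects(rdf_graph))
print(len(topics), len(associations))
```

The `scope` field is where the context from the last bullet would go; it is left empty here, meaning the statement has no limiting context.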
Continuing, Michael eventually comes to RDF's requirement that statements must be atomic and semantically independent. That is, you can't ever assert the sentence P as part of a predicate calculus statement about P. The rationale for this is that it must be possible to merge RDF graphs without having parts of one graph change the meaning of parts of another graph. (They may make conflicting statements, but this is another issue entirely.)
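Why independence matters for merging can be shown with a small sketch, assuming graphs are modelled as plain Python sets of triples with hypothetical `ex:` identifiers. It ignores blank nodes, which a real RDF merge has to keep apart. If every statement stands on its own, merging is just set union, and nothing asserted by either graph is lost or changed.

```python
# Two independent graphs, each a set of (subject, predicate, object) triples.
graph_a = {
    ("ex:oslo", "ex:locatedIn", "ex:norway"),
}
graph_b = {
    ("ex:moss", "ex:locatedIn", "ex:norway"),
    ("ex:oslo", "ex:capitalOf", "ex:norway"),
}

def merge(*graphs):
    """Merge graphs by taking the union of their statements."""
    return set().union(*graphs)

merged = merge(graph_a, graph_b)

# Every statement from each source graph is still asserted after the merge.
print(graph_a <= merged and graph_b <= merged)
```

If the meaning of one triple depended on other triples around it, this simple union would no longer be safe.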
Michael thinks this is "inconvenient and opaque", and asks whether Topic Maps has the same restriction. The answer is actually "no", but I'm not sure how much of a difference this makes in practice.
Firstly, I think explaining this rule to people using the technology is in most cases going to be impossible, and so even though RDF does have this rule I wouldn't expect RDF users to be aware of that, or to abide by it if ignoring the rule would get them to their goal quicker.
Secondly, since adherence to this rule cannot be verified automatically, I would assume that there would in fact be RDF users violating it without anything much in the way of immediate adverse consequences. So although words to this effect may appear in the spec, the practical consequences of those words may well be nil.
On the other hand, this is clearly a best practice for use of both technologies, since violating it can indeed lead to serious problems in merging, regardless of which technology you are using. That is, if you actually do this; I think that in most practical applications the issue is unlikely to crop up.
I think this is one of the cases where it's better to let users do what they want with the piece of rope they've been given, and if they do develop difficulties breathing that's just too bad. But then it does seem that this is pretty much the state of affairs anyway.
Quints and quads
Divers, Oslo city centre
Michael also asks whether something like my Q model would be able to solve this problem. The reason he brings this up is that part of the problem with the previous restriction is that in order to make predicate calculus statements you need to speak about statements without asserting them as true. That is, you want to be able to say "if P then Q" without actually at any point saying "P".
The trouble is that replacing triples with quads or quints would not really make any difference to the ability to speak about a statement without having to make that statement. In fact, although the RDF representation of reification is justly reviled, this is one thing it actually does get right, since with this representation one can make an RDF node representing a statement P without P actually being asserted.
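A minimal sketch of what this looks like, again with the graph as a plain set of tuples. The terms `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` are the standard RDF reification vocabulary; the `reify` helper and everything in the `ex:` namespace are hypothetical names made up for the illustration.

```python
def reify(node, statement):
    """Return the four triples that describe `statement` via `node`."""
    subject, predicate, obj = statement
    return {
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    }

# P: a statement we want to talk about without claiming it is true.
p = ("ex:sky", "ex:colour", "ex:green")

graph = reify("ex:stmtP", p)
# Now we can say things about P, for example who claims it.
graph.add(("ex:stmtP", "ex:claimedBy", "ex:someone"))

# The graph contains a node describing P, but P itself is not asserted.
print(p in graph)
```

Adding a fourth or fifth slot to each tuple would not change this picture: any statement placed in the graph is still asserted.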
Topic Maps cannot do this at present. One could of course use a vocabulary similar to RDF's reification vocabulary, which would make this possible. I'm not sure how much value this would have unless it was standardized and supported throughout the technology stack. It might conceivably work just fine without being standardized.
Michael also asks whether there is "a short, clear story about the relation between the kinds of things you can and cannot express in RDF, or Topic Maps". That is a good question, and as far as I am aware the answer is "no". This would clearly be a good thing to have, but it would require a bit of effort to put it together. I'm not really up to that right now.
Brett Kromkamp - 2009-10-03 18:09:41
I am currently developing a topic map-based knowledge management and repository application with Adobe Flex/AIR: http://www.quesucede.com/page/show/id/polishedcode