Larsblog

TM/XML

2005-12-03 16:35

The design of TM/XML, first heard of at TMRA'05, has now at long last been finalized, and the paper about it sent off to the publishers. I figured this was a good time to spread the word a little more, so here is a short introduction to TM/XML.

The best way to think of TM/XML is as a kind of LTM-in-XML, in the sense that, like LTM, it is more human-friendly than XTM, and, also like LTM, it is not in any way standardized. It's just a proposal, to be implemented in Ontopia software, which anyone else can implement if they want to.

The reason it was developed was that we are creating a web service interface to the OKS, and the idea is that this will be used by clients which do not use Topic Maps software to access the TM server. There is no problem with that, except that these clients are not likely to be very happy about having to send and receive XTM, since XTM requires you to understand Topic Maps, and is pretty difficult to process with XSLT.

So this is where TM/XML comes to the rescue, by providing a nice, natural XML syntax for Topic Maps data. It's meant to be used in fragments passed to and from web services, but can also be used for entire topic maps. For those editing their topic maps in XML editors, I imagine this would be enormously nicer than XTM, especially as they can create their own DTD which incorporates some of their Topic Maps schema rules.

Note that it is not the intention that TM/XML should replace XTM. For interchange between Topic Maps implementations XTM does the job just fine, and there is no need to replace it. TM/XML is all about extending the reach of Topic Maps and making it easier to use Topic Maps in contexts where they were not used before.

But enough generalities. Here is an example, a fragment showing the Puccini topic from the Italian Opera topic map. It's been cut down a little to get rid of some of the repetition, and I cut the namespace declarations because they are so long and tedious.

<topicmap ...>
  <music:composer id="puccini">
    <tm:identifier>http://en.wikipedia.org/wiki/Puccini</tm:identifier>
    <iso:topic-name>
      <tm:value>Puccini, Giacomo</tm:value>
    </iso:topic-name>
    <iso:topic-name scope="basename:short-name">
      <tm:value>Puccini</tm:value>
    </iso:topic-name>
    <iso:topic-name scope="basename:normal">
      <tm:value>Giacomo Puccini</tm:value>
    </iso:topic-name>

    <biography:date-of-birth>1858-12-22</biography:date-of-birth>
    <biography:date-of-death>1924-11-29</biography:date-of-death>
    <opera:webpage datatype="http://www.w3.org/2001/XMLSchema#anyURI"
           >http://www.r-ds.com/opera/pucciniana/gallery.htm</opera:webpage>
    <music:sound-clip datatype="http://www.w3.org/2001/XMLSchema#anyURI"
           >http://www.puccini.it/files/vocepucc.wav</music:sound-clip>
    <!-- cut more occurrences -->

    <biography:pupil-of scope="psi.ontopia.net:biography" 
                           topicref="bazzini" 
                           role="biography:pupil" 
                           otherrole="biography:teacher"/>
    <biography:born-in scope="psi.ontopia.net:biography" 
                          topicref="lucca" 
                          role="psi.ontopia.net:person" 
                          otherrole="geography:place"/>
    <!-- cut more associations -->
  </music:composer>
</topicmap>

The basic idea is that the elements that represent topics, topic names, occurrences, and associations are created from the types of these constructs. We form namespace names from the subject identifiers of these types, which gives us music:composer for Puccini, for example. The result, as should be obvious, is much more XSLT-friendly than XTM is.
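To make the point about easy processing concrete, here is a minimal Python sketch that reads a cut-down version of the Puccini fragment with the standard library's ElementTree. The namespace URIs are placeholders (the real declarations are not shown in this post), and the rule used to tell names from occurrences (a name has a tm:value child, an occurrence has plain text) is an assumption made for this illustration, not a statement of the full TM/XML rules. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: the post omits the real declarations,
# so these are placeholders chosen only to make the fragment parse.
TM = "http://example.org/tmxml#"

FRAGMENT = """
<topicmap xmlns:tm="http://example.org/tmxml#"
          xmlns:iso="http://example.org/iso-psi/"
          xmlns:music="http://example.org/music/"
          xmlns:biography="http://example.org/biography/">
  <music:composer id="puccini">
    <tm:identifier>http://en.wikipedia.org/wiki/Puccini</tm:identifier>
    <iso:topic-name>
      <tm:value>Puccini, Giacomo</tm:value>
    </iso:topic-name>
    <biography:date-of-birth>1858-12-22</biography:date-of-birth>
    <biography:date-of-death>1924-11-29</biography:date-of-death>
  </music:composer>
</topicmap>
"""


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def summarize_topic(topic):
    """Collect the id, type, names and text occurrences of one topic element."""
    uri, local = split_tag(topic.tag)
    # Assumed: the type's subject identifier is namespace URI + local name.
    summary = {"id": topic.get("id"), "type": uri + local,
               "names": [], "occurrences": {}}
    for child in topic:
        child_uri, child_local = split_tag(child.tag)
        if child_uri == TM:
            continue  # skip tm:identifier and other tm: bookkeeping elements
        value = child.find("{%s}value" % TM)
        if value is not None:
            # Assumed: an element with a tm:value child is a topic name.
            summary["names"].append(value.text)
        elif child.text and child.text.strip():
            # Assumed: an element with plain text content is an occurrence.
            summary["occurrences"][child_uri + child_local] = child.text.strip()
    return summary


root = ET.fromstring(FRAGMENT)
topics = [summarize_topic(t) for t in root]
print(topics[0])
```

Because the type is carried by the element name, the code never has to chase references to find out what kind of statement it is looking at; the same property is what makes XPath expressions over TM/XML short.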

The structure of the names may look a little funny, but the iso:topic-name PSI identifies the default name type for topic names, since the names in this case have no type. The tm:value element inside it is necessary in order to support variant names cleanly.

The references in attribute values follow the same approach as the element type names: QNames are used for PSIs, and simple IDs are used if no PSIs are available.
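As a rough illustration of that rule, the following Python sketch classifies a reference from an attribute such as topicref, role, or scope. It assumes that a value containing a colon is a QName whose PSI is the namespace URI followed by the local part, and that anything else is a plain topic ID. The prefix mapping and the function name resolve_reference are hypothetical, since the real namespace declarations are not shown here.

```python
# Hypothetical prefix-to-namespace mapping; the post omits the real
# declarations, so these URIs are placeholders.
NAMESPACES = {
    "biography": "http://example.org/biography/",
    "geography": "http://example.org/geography/",
}


def resolve_reference(ref, namespaces):
    """Classify a reference as a PSI (QName form) or a local topic ID."""
    if ":" in ref:
        prefix, local = ref.split(":", 1)
        # Assumed expansion: namespace URI followed by the local part.
        # An undeclared prefix raises KeyError.
        return ("psi", namespaces[prefix] + local)
    return ("id", ref)


print(resolve_reference("biography:pupil", NAMESPACES))
print(resolve_reference("bazzini", NAMESPACES))
```

In this sketch an undeclared prefix is treated as an error rather than silently falling back to an ID, which seems the safer choice for a deserializer.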

So, if you think this looks nice, and you want to use it, what can you do? Well, the syntax is there, so you can start writing/creating topic maps in it straight away. There is a RELAX-NG schema for the syntax to help you, as well as an XSLT stylesheet for converting from TM/XML to XTM. There's also more information in the slides from TMRA'05, although they describe a slightly earlier version.

TM/XML import/export will be available as part of OKS 3.0, and therefore also as part of the Omnigator free download. Finally, if you want to implement support for TM/XML, send me an email, and I'll do what I can to help you.

Comments

Alex - 2005-12-07 00:56:16

Any reason you've left out the juicy PSI namespace declarations? The biggest problem I see with this is going the other way (from XTM to TM/XML) because you have to be namespace aware.

Also, I'm curious about the construct:

 <iso:topic-name>
    <tm:value>Puccini, Giacomo</tm:value>
 </iso:topic-name>

Is this because it's just a simple remapping of XTM? Also, it isn't clear from your description, but with:

   <biography:pupil-of ... />

I assume again this is a short-hand encapsulation of associations? Are there rules for serialisation and unpacking of these constructs?

And lastly, couldn't there be predefined versions of a lot of typical URI datatypes? datatype="http://www.w3.org/2001/XMLSchema#anyURI" could be replaced with a href="[data]" etc? It certainly would appeal to the XML and HTML crowd.

Apart from that, I like it a lot. I can't use it for a lot of my automata, but for hand-crafted stuff it's really nice.

Lars Marius Garshol - 2005-12-07 10:53:54

I left out the namespace declarations mainly because they take up a lot of space.

Not sure why you think being namespace-aware is a problem?

Not sure what you are reacting to in the name construct. The tm:value element? That's there so that variants can be added cleanly. You could leave it out when there are no variants, of course, but that would mean that any XSLT would have to test for this when processing TM/XML, and it just didn't seem very clean. iso:topic-name is just the PSI for the default topic name type in TMDM.

The biography:pupil-of element is a compact rendering of a binary association, yes. TM/XML also covers unaries and n-aries.

The serialization rules for TM/XML exist, and are written up in the TMRA'05 paper.

The URI datatype question is a good point. I'll think about that one.

Glad you like it!

Roger Sperberg - 2006-06-28 01:20:51

While the paper describes the serialization process, at times I'm wanting to see the reverse -- the unpacking -- described, so as to make sure I'm understanding what's going on.

More examples would help (as I've written to you to say :-).

-- Roger
