The applications of SDshare
Posted in Technology on 2010-11-21 14:29
Graham and Marc Wilhelm presenting SDshare
A few years ago Graham Moore came up with the idea of publishing changes to topic maps using Atom, and a CEN project has now developed and published a specification for it, called SDshare. Work is also underway to make SDshare a full ISO standard.
The workings of SDshare are very simple. There are four kinds of feed all told:
- The overview feed: This feed contains a list of the different collection feeds. Essentially it shows what collections (topic maps) are available on the server.
- The collection feeds: There's one of these for each collection, and it simply contains two links: one to the snapshot feed for the collection and one to the fragment feed.
- The snapshot feeds: This contains a list of links to snapshots of the entire collection at different points in time. To start aggregating a topic map or resync after something has gone wrong, download one of these snapshots.
- The fragment feeds: This feed has a list of links to fragments for each change made to the topic map. The fragment contains all topics modified in that change.
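To make the feed hierarchy concrete, here is a minimal sketch in Python that models the four kinds of feed as plain in-memory objects. It is not the Atom representation the specification defines: the class names, field names, and example.org URLs are all hypothetical, and timestamps are assumed to share one UTC format so that they sort as strings.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    updated: str  # timestamp; assumed to be in one UTC format so strings sort by time
    link: str     # where the snapshot or fragment can be downloaded


@dataclass
class CollectionFeed:
    title: str
    snapshot_feed: list = field(default_factory=list)  # one Entry per snapshot
    fragment_feed: list = field(default_factory=list)  # one Entry per change


@dataclass
class OverviewFeed:
    collections: list  # one CollectionFeed per topic map on the server


def latest_snapshot(collection):
    """Pick the most recent snapshot to start (or restart) from."""
    return max(collection.snapshot_feed, key=lambda e: e.updated)


def fragments_since(collection, timestamp):
    """Fragments published after `timestamp`, oldest first."""
    newer = [e for e in collection.fragment_feed if e.updated > timestamp]
    return sorted(newer, key=lambda e: e.updated)


base = "http://example.org/sdshare/customers"  # hypothetical server layout
overview = OverviewFeed(collections=[CollectionFeed(
    title="customers",
    snapshot_feed=[Entry("2010-11-01T00:00:00Z", base + "/snapshots/1"),
                   Entry("2010-11-15T00:00:00Z", base + "/snapshots/2")],
    fragment_feed=[Entry("2010-11-14T09:30:00Z", base + "/fragments/41"),
                   Entry("2010-11-16T12:00:00Z", base + "/fragments/42")],
)])

collection = overview.collections[0]
start = latest_snapshot(collection)                  # the second snapshot
todo = fragments_since(collection, start.updated)    # only fragments newer than it
```

A client starting from the latest snapshot only needs the fragments published after it; older fragments are already reflected in the snapshot.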
This is enough for a client to first get a copy of the entire topic map, and then to keep its local copy up to date against the remote server by polling for new fragments, and applying the fragments to its local copy. By tagging all names, occurrences, and associations with item identifiers showing which source they came from, the client can apply new fragments without running the risk of losing local modifications to the topic map.
An important property of the protocol is that applying a fragment is idempotent. That is, if by mistake you should happen to apply the same fragment more than once, the effect is the same as applying it only once.
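The source tagging and the idempotency property can be illustrated with a small sketch. It assumes a much simplified model that is not taken from the specification: a topic is a set of (source, statement) pairs, and a fragment carries the complete current state of each changed topic as seen by its source. The function name `apply_fragment` and the sample data are hypothetical.

```python
def apply_fragment(local, source, fragment):
    """Replace what `source` says about each topic in the fragment.

    local:    topic id -> set of (source, statement) pairs
    fragment: topic id -> set of statements (the topic's full state at `source`)
    """
    for topic_id, statements in fragment.items():
        topic = local.setdefault(topic_id, set())
        # Remove only the statements previously received from this source...
        topic -= {pair for pair in topic if pair[0] == source}
        # ...and add the current ones, tagged with where they came from.
        topic |= {(source, s) for s in statements}
    return local


def copy_map(topic_map):
    """Copy the map so each run below starts from an independent state."""
    return {topic_id: set(pairs) for topic_id, pairs in topic_map.items()}


local = {"oslo": {("server", "name: Oslo"),
                  ("local", "note: check population")}}
fragment = {"oslo": {"name: Oslo", "population: 600000"}}

once = apply_fragment(copy_map(local), "server", fragment)
twice = apply_fragment(copy_map(once), "server", fragment)
```

Because the function replaces the source's statements rather than appending to them, `once` and `twice` are equal, and the statement tagged `local` survives both applications.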
In the Hafslund project, presented by Axel Borge at Topic Maps 2010, SDshare is going to be used to build a unified view of a number of separate information systems. This is done by making an SDshare server for each of the information systems, publishing the contents of the system as a topic map and a feed of fragments for all changes made in the system.
This is sufficient for a single hub server to be set up, which polls all of the feeds and retrieves the fragments into its local topic map. The result is a unified topic map containing a merged view of all the different information systems. In the Hafslund case this is used to provide automated tagging of archived documents by exploiting relationships between topics, but the pattern is generic and could be applied to a wide range of different use cases.
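The hub pattern can be sketched along the same lines. This is an illustration under stated assumptions, not the Hafslund implementation: the source names `archive` and `crm`, the `Hub` class, and the sample data are hypothetical, each feed is reduced to a list of (timestamp, fragment) pairs, and both sources are assumed to use the same identifier for the same topic.

```python
class Hub:
    def __init__(self):
        self.topics = {}     # topic id -> set of (source, statement) pairs
        self.last_seen = {}  # source name -> timestamp of last applied fragment

    def poll(self, source, feed):
        """Apply fragments newer than the last one seen; return how many."""
        since = self.last_seen.get(source, "")
        new = sorted((e for e in feed if e[0] > since), key=lambda e: e[0])
        for updated, fragment in new:
            for topic_id, statements in fragment.items():
                topic = self.topics.setdefault(topic_id, set())
                # Replace only what this source said earlier about the topic.
                topic -= {pair for pair in topic if pair[0] == source}
                topic |= {(source, s) for s in statements}
            self.last_seen[source] = updated
        return len(new)


# Two hypothetical source systems, each publishing one change about doc-17.
feeds = {
    "archive": [("2010-11-01T10:00:00Z",
                 {"doc-17": {"title: Grid inspection report"}})],
    "crm": [("2010-11-02T09:00:00Z",
             {"doc-17": {"customer: customer-42"}})],
}

hub = Hub()
first = {name: hub.poll(name, feed) for name, feed in feeds.items()}
second = {name: hub.poll(name, feed) for name, feed in feeds.items()}
```

After the first round the hub's `doc-17` topic holds statements from both systems; the second round finds nothing new, because the hub remembers the last fragment it applied from each source.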
At the moment, SDshare only has a pull interface, but theoretically one could extend the specification to provide a push interface as well. (A push interface is being considered seriously.) It would basically be an endpoint to which fragments can be POSTed using HTTP. The main use case for push would be replication in environments where the recipient cannot connect to the source, which actually happens quite frequently.
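Since no push interface has been specified yet, the following is only a guess at what the sending side might look like, using Python's standard `urllib`. The endpoint URL, the `application/xml` content type, and the function names are all assumptions made for illustration.

```python
import urllib.request


def build_push_request(endpoint_url, fragment_bytes,
                       content_type="application/xml"):
    """Build an HTTP POST carrying one serialized fragment."""
    return urllib.request.Request(
        endpoint_url,
        data=fragment_bytes,
        method="POST",
        headers={"Content-Type": content_type},  # assumed media type
    )


def push(endpoint_url, fragment_bytes):
    """Send the fragment and return the HTTP status code."""
    request = build_push_request(endpoint_url, fragment_bytes)
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical endpoint; the request is only built here, not sent.
request = build_push_request(
    "http://production.example.org/sdshare/push", b"<fragment/>")
```

Because applying a fragment is idempotent, a sender that is unsure whether an earlier POST arrived could simply send the same fragment again.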
Many Topic Maps portals, for example, maintain two different Topic Maps servers. One is being worked on directly by human editors, while the other is used only for serving the portal in production. The production server is typically updated every night, by importing a new topic map to replace the old.
A much more efficient solution would be for the editing server to maintain a list of changes (as SDshare already does), and then to push them to the production server. Pushing could be done either on request, or at specified time intervals. Since this would mean updating only what has actually changed, it would be far more efficient than moving over the entire topic map.
Many projects have more than one server containing the same topic map, for example development servers, test servers, staging servers, and one or two production servers. These don't necessarily all need to have exactly the same topic map, but they do need to have the same ontology. The same ontology changes typically have to be made manually on each individual server, which is tiresome and error-prone.
With the push interface one could make the changes on any server, and simply push them to the other servers when ready.
Conforming to architectural constraints
Some organizations have an enterprise architecture that requires all integrations to go via an ESB or a similar service. For a Topic Maps application to connect directly to a database via JDBC, for example to run DB2TM, would be frowned on in such places. Instead, one would have to put some sort of integration service into the ESB, and run the conversion there. But how to get the topic map into the application? That's right, by exposing an SDshare feed.
As you can see, this is a very useful protocol with a wide range of applications. I'm sure there are more than the ones I've thought of so far.