TMRA 2006 — day 2
Posted in Technology on 2006-10-12 09:09
Sam Oh presenting
Day two started right off with two parallel tracks, and I went to the track on "Portals and Information Retrieval", where the first speaker was Sam Oh. He presented a study in which his group built a Topic Maps-based portal to Korean folk music (known as pansori) with a quite detailed ontology of the domain. They then compared how easily 20 test subjects could find information in the traditional portal and the new Topic Maps-based one. 10 subjects used the traditional portal first and the Topic Maps one afterwards, and the remaining 10 did it the other way around. They tested both objective measures (how many steps it took to find information in 6 assigned search tasks) and subjective measures (how the test subjects felt about the portals).
The Topic Maps portal clearly performed better on both the objective and the subjective measures, although there were some examples of the traditional portal performing better. In general, the Topic Maps portal did much better on complex queries, but the difference was smaller on simpler queries. One of the most interesting things was that the users reported that they felt the traditional system to be "fragmented", and that it was hard to guess where to find information in the hierarchical systems.
Sam said he wanted to continue the study with more test subjects, and test subjects that are not his own students, such as normal users and also domain experts.
Eszter Horvati of Ovitas Norway was the next speaker, on Compass, a Topic Maps-based search tool built on Lucene. Association types are annotated with a semantic distance measure (between 0 and 1) that applies to all associations of that type. The semantic distance at two hops is computed by multiplying the distances of the two associations.
The basic search process is that when the user searches for a term the topic map is used to expand the query by adding closely related terms and synonyms to the full-text search. Relevance in the search results is a combination of Lucene's relevance result and the semantic distances. The search result is shown grouped by the related terms that were used from the topic map. Entering a term that is not a topic gives just a normal full-text search.
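The distance arithmetic and the score combination can be sketched roughly like this (the function names and the exact combination formula are my own assumptions, not Compass internals):

```python
# Hypothetical sketch of Compass-style scoring: semantic distance along a
# path of associations, combined with a full-text relevance score.
# Function names and the combination rule are assumptions, not Compass code.

def path_distance(hop_distances):
    """Distance along a path is the product of the per-association
    distances, each between 0 and 1, so longer paths score lower."""
    result = 1.0
    for d in hop_distances:
        result *= d
    return result

def combined_score(fulltext_relevance, semantic_distance):
    """One plausible combination: weight the full-text engine's relevance
    by how semantically close the matched term is to the query topic."""
    return fulltext_relevance * semantic_distance

# Two hops: a 0.8 association followed by a 0.5 association gives 0.4,
# so a term two weak hops away contributes less than a direct match.
distance = path_distance([0.8, 0.5])
score = combined_score(2.5, distance)
```

The multiplicative rule has the nice property that a chain of strong (near 1.0) associations stays relevant, while any weak link quickly drives the whole path toward zero.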
They use an Excel plug-in as the authoring interface for the topic map, but it's also possible to build a custom editing interface. There is of course also support for the normal browsing by following associations. The Topic Maps support is provided by NetworkedPlanet's TMCore engine. It's also possible to plug in any full-text search engine in place of Lucene.
Henrik Laursen of the Royal Library of Denmark then spoke on a Topic Maps project he's done. They've created a very simple topic map that is effectively a thesaurus. They have a modified Omnigator interface that can be used to browse the thesaurus. The occurrences are links into the search system that perform a search for the current term in the library catalogues, and they've also added links to online sources.
Graham waving his arms
After the break Graham Moore spoke on Topic Map Objects, which is a feature of the TMCore engine. It lets you generate a domain-specific object model from your ontology, where the data is actually stored in the topic map. This means that instead of Topic objects you have Person objects, and instead of calling addOccurrence you call setAge.
The way it works is that you create a C# interface and then use C# attributes to define how the class and its methods map to the ontology you use. One confusing thing is that you still seem to have to implement the actual methods manually and call out to some TMCore methods. There is also a requirement that your class extend a TopicMapObjectBase class.
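Graham's implementation is C#, but the basic idea can be sketched in Python (all class and property names here are invented for illustration; the real TMCore API differs):

```python
# Rough Python analogue of the Topic Map Objects idea: a domain class whose
# properties read and write an underlying topic, so that setting an age
# really stores an occurrence. Everything here is invented for illustration;
# TMCore's actual API is C#-based and different.

class Topic:
    """Minimal stand-in for a topic backed by a topic map."""
    def __init__(self):
        self.occurrences = {}  # occurrence type -> value

class TopicMapObjectBase:
    """Domain objects extend this and delegate storage to the topic."""
    def __init__(self, topic):
        self.topic = topic

class Person(TopicMapObjectBase):
    # Instead of calling addOccurrence("age", ...) the caller uses a
    # domain-specific property; the data still lives in the topic map.
    @property
    def age(self):
        return self.topic.occurrences.get("age")

    @age.setter
    def age(self, value):
        self.topic.occurrences["age"] = value

p = Person(Topic())
p.age = 42               # stored as an occurrence under the hood
```

This also illustrates the drawback Graham mentioned: from `p` alone there is no path back to the person *topic type*, unless the base class is extended to expose it.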
One drawback to this, as Graham explained, is that it loses the introspective aspect of Topic Maps where you can traverse from the person instance to the person topic type and find more information about the type this way. They want to extend the API a bit so that the domain objects have these abilities in addition to the pure domain methods.
One interesting feature of their implementation is that you can create a domain object and start working with it before it is added to the topic map.
There is also a web service interface, but as far as I can work out it is not related to the Topic Map Objects feature. It works with standard XTM and supports very low-level operations on topic maps, so that you can use the web service to, for example, just set the value of an occurrence. (I'm skeptical of this in general, but I do see that it's useful for AJAX. Robert Cerny has done the same thing in Topincs specifically to support his AJAX-based interface.)
The next speaker was Robert Cerny on his Topincs Topic Maps editor. Due to laptop trouble, his presentation started a bit late. His goal was to make software he could use to take notes in a structured form that would be accessible from anywhere, on any computer. So he built Topincs, an AJAX-based web application built with PHP and MySQL. It's meant to be used by groups as well as individuals, and as I wrote yesterday it's set up so that people can use it to make notes from the conference.
Further work for Robert includes creating specifications for JTM and his Topincs web service interface. He also wants to support specific views for different topic types, rather than the generic view he has now; to be able to reference topics from inside occurrence values so that they turn into links in the view; and to make it possible to load documents into the system. And, finally, he wants version control.
He gives away personal stores for free; all you have to do is email him. He's also set up a number of topic-specific stores on topics.org (soon).
Open space session
The first presenter was Graham Moore, who showed what he called a visual ontology editor. It's based on Microsoft Visual Studio and defines shapes that can be used to model Topic Maps ontologies, with topic types, type hierarchies, association types, occurrence types, constraints and cardinalities, etc. The editor lets you define a PSI namespace as a base URI; each name in the ontology is appended to it to form the PSI of that ontology construct. Visual Studio can export this to an XML representation, which can then be processed and loaded into a topic map.
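The PSI generation itself is just string concatenation, roughly like this (the base URI and names are made-up examples, not from the demo):

```python
# Sketch of PSI generation from a namespace declaration: each ontology
# construct's name is appended to the base URI to form its PSI.
# The URI and names are invented examples.

BASE_URI = "http://psi.example.org/music/"

def psi(construct_name):
    """Form the PSI of an ontology construct from the namespace base URI."""
    return BASE_URI + construct_name

person_psi = psi("person")        # http://psi.example.org/music/person
composed_by_psi = psi("composed-by")
```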
Lars Heuer then showed AsTMa= 2.0, which he and Robert Barta have been working on. He called it "creating Topic Maps in pidgin English". The slides were effectively an AsTMa= 2.0 tutorial, which I won't attempt to reproduce here. The slides should be on the TMRA site eventually. What's interesting is that he wasn't exaggerating when he called it pidgin English; the sentence "paul-mccartney plays-for The-Beatles, which isa music-group and which is-located-in London" would actually transform correctly into a topic map. The AsTMa? 2.0 query language allows similar semi-natural language queries, like "$who isa person".
Robert Barta gave us an update on the state of the Perl Topic Maps module. His first point was that the "Perl-XTM" engine reflected his state of understanding of Topic Maps in 2000/2001, and according to him it "sucks horribly". However, he is now updating it, and the new version is (again according to him) much better and cleaner. He gave some highlights of the new features (DNS virtual topic map, TMRM support, AsTMa= 2.0 parser, ...). Unfortunately, there are no slides, so they will not be on the TMRA site.
I then presented Ontopia's DB2TM module, which can be used to do conversions from relational data to Topic Maps, and also to keep the topic map up to date as the relational data source changes.
This inspired Robert Barta to show a similar solution based on AsTMa= where "template topics" are declared, which have strings attached as occurrences of a particular type. Inside the occurrence is AsTMa= data with column references (rather like the XML syntax in DB2TM). This can be merged with a SQL query to do a similar kind of conversion to what DB2TM allows, but there is no support for synchronization.
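The template idea can be sketched like this (the template syntax, the `$column` reference style, and all the names are simplified stand-ins, not real AsTMa=):

```python
# Sketch of the template-topic idea: an AsTMa=-like template containing
# column references is instantiated once for each row of a SQL result set,
# and the fragments are merged into one source text. The template syntax
# here is a simplified stand-in, not actual AsTMa=.

import string

# Template with column references ($id, $band stand in for SQL columns).
template = string.Template("$id isa person\n$id plays-for $band\n")

rows = [  # stand-in for the rows returned by a SQL query
    {"id": "paul-mccartney", "band": "the-beatles"},
    {"id": "john-lennon", "band": "the-beatles"},
]

# One fragment per row, merged into a single AsTMa=-like source.
astma_source = "".join(template.substitute(row) for row in rows)
```

Since this is a one-shot transformation of whatever the query returns, it is easy to see why synchronization against a changing database is not covered: nothing tracks which topics came from which rows.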
Lars Heuer then got up again to show his Python Topic Maps engine, which has a nice feature that allows you to treat the topic objects as dict objects (or hashes, in Perl-speak), so that you can get the age of a person with lars['age'], and assignments work the same way. It also recognizes data types automatically and converts back and forth seamlessly.
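A minimal sketch of how such dict-style access could be implemented (the automatic type conversion rule here is my guess at what "recognizes data types" means, not his actual code):

```python
# Sketch of the dict-style access Lars Heuer showed: occurrences exposed
# through __getitem__/__setitem__, with values converted to and from typed
# Python objects. The conversion rule (store as string, recover ints on
# read) is my own guess, not his implementation.

class Topic:
    def __init__(self):
        self._occurrences = {}  # occurrence type -> string value

    def __setitem__(self, occ_type, value):
        # Topic map occurrence values are strings, so store the string form.
        self._occurrences[occ_type] = str(value)

    def __getitem__(self, occ_type):
        # Convert back to a typed value where the string form allows it.
        raw = self._occurrences[occ_type]
        return int(raw) if raw.lstrip("-").isdigit() else raw

lars = Topic()
lars['age'] = 30          # assignment stores the occurrence
years_left = lars['age'] + 1   # reading converts back to int
```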
Stefan Lischke - 2006-10-12 12:08:05
For me it seems like TMObjects is XMLBeans for TopicMaps.
XMLBeans is used to generate a domain-specific object model from an XML Schema Definition (XSD). With XMLBeans you are able to use the generated domain-specific methods to modify your model (XML), but you are also able to modify the model (XML) with the StAX API directly.
Translated to TM, this sounds to me like using TMCL to generate an ontology-specific object model. You can then modify your model (TM) with the generated methods like addPerson(), but you are still able to modify the model with TMAPI method calls, or calls to the TMQL engine.
Lars Marius - 2006-10-12 13:59:05
Yes, I think this is a good comparison. You could call it data binding for Topic Maps, I guess, or an "object-TM mapping".
And, yes, you can still use the raw engine API to modify the topic map if you want to. Or run a query.
rho - 2006-10-21 01:38:47
The paper/talk got me thinking about the impact on TMQL, or maybe more on TMQL implementations. In a query
select $p where $p isa person
one implementor could take the position that not 'topic items' but actual objects of class PERSON are returned to the application. A TMQL processor only needs to be configured with how to populate an instance of class PERSON from a topic.
This is just another form of 'atomification', i.e. the conversion of a TM item into a primitive data value (as is implicitly done for basic data types already in TMQL).
Not sure whether this is a (TMQL) standardization issue or only an implementation issue.
Lars Marius - 2006-10-21 10:58:55
I didn't think of that, but it's a good point. In my opinion, if a TMQL processor wants to translate person topics to Person objects, it should be free to do so, but that's at the API level and has nothing to do with the language itself. I.e. it's not a standardization issue until we do a TMQL API.