Robert Barta |
Posted in Technology on 2006-03-31 16:17

Lots of people think that the hierarchical association type used in taxonomies is the supertype-subtype association, but this is, unfortunately, wrong. After running into three instances of this misunderstanding this week, I decided to do my bit to clear this up once and for all.

It's not really difficult to see how this confusion came about in the first place, given that taxonomies consist of terms arranged in a hierarchy, with the most general terms at the top, and the most specific ones at the bottom. It's the same with class hierarchies, and so it's natural to think that the association type used in both cases must be the same.

So what's the problem, then? Well, to understand that, it helps to understand the supertype-subtype association type better.

The semantics

There are three rules about what the supertype-subtype association means, and every use of it must follow all three rules, which are:
Reading the formal definition in all its glory is recommended.

Who cares about semantics?

You might be wondering why following these rules is so important, and who really cares whether you do. Well, the Topic Maps software cares, because it will believe that you mean what you say. So tolog queries will start producing the wrong answers if you abuse this association type, as will Topic Maps validators, etc. So don't do it.

If you are not sure about the relationship you are representing, and whether it really is supertype-subtype, then just don't use the standard supertype-subtype PSIs, and call it something else. You'll lose some functionality, but it's functionality you're not sure you want, anyway.

Back to taxonomies

Taxonomies generally do not consist of types, but instead just consist of various terms (body parts, countries, diseases, academic disciplines, and so on), all mixed up. So this alone is enough to disqualify the supertype-subtype association from being used to represent taxonomies.

In fact, when librarians construct thesauri (which are effectively a superset of taxonomies) they follow a procedure where they identify the relationships between the terms in the thesaurus, and there is a set of categories of relationships that generally turn into hierarchical associations in the thesaurus. This list includes the supertype-subtype relation, the part-whole relation (for example, Norway is a part of Europe), the containment relation, and so on.

So it's not the case that the supertype-subtype association never occurs in taxonomies; it's just that you can't assume that all the relationships in a taxonomy are supertype-subtype relations.

So if you want to represent a taxonomy or a thesaurus in Topic Maps, my recommendation is to use Kal Ahmed's PSIs for thesauri, which contain most of what you are likely to need.
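To make the "wrong answers" problem concrete, here is a minimal Python sketch of how software that trusts supertype-subtype might reason. It is a toy model, not tolog and not any real Topic Maps engine: the class `ToyTopicMap`, its methods, and the topics (`opera`, `finger`, `hand`, and so on) are all invented for illustration. It assumes only two properties of supertype-subtype: that it is transitive, and that an instance of a subtype is also an instance of the supertype.

```python
from collections import defaultdict


class ToyTopicMap:
    """Hypothetical toy store; not the API of any real Topic Maps engine."""

    def __init__(self):
        self.supertypes = defaultdict(set)    # subtype -> direct supertypes
        self.direct_types = defaultdict(set)  # instance -> directly asserted types
        self.other = defaultdict(set)         # association type -> (narrow, broad) pairs

    def add_subtype(self, subtype, supertype):
        self.supertypes[subtype].add(supertype)

    def add_instance(self, instance, type_):
        self.direct_types[instance].add(type_)

    def add_association(self, assoc_type, narrow, broad):
        # Stored under its own name; never used for type inference.
        self.other[assoc_type].add((narrow, broad))

    def all_supertypes(self, type_):
        """Follow supertype-subtype transitively (terminates even on loops)."""
        seen = set()
        stack = [type_]
        while stack:
            current = stack.pop()
            for parent in self.supertypes.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def instance_of(self, instance, type_):
        """True if the instance has the type directly or via a supertype chain."""
        for direct in self.direct_types.get(instance, ()):
            if direct == type_ or type_ in self.all_supertypes(direct):
                return True
        return False


# Legitimate use: every opera really is a musical work.
good = ToyTopicMap()
good.add_subtype("opera", "musical-work")
good.add_instance("tosca", "opera")
print(good.instance_of("tosca", "musical-work"))  # True, and correct

# Abuse: a part-whole relation recorded as supertype-subtype.
bad = ToyTopicMap()
bad.add_subtype("finger", "hand")
bad.add_instance("my-left-thumb", "finger")
print(bad.instance_of("my-left-thumb", "hand"))   # True, but a thumb is not a hand

# Same fact under a differently named association: no false inference.
safe = ToyTopicMap()
safe.add_association("part-whole", "finger", "hand")
safe.add_instance("my-left-thumb", "finger")
print(safe.instance_of("my-left-thumb", "hand"))  # False
```

The third case is the "call it something else" advice in miniature: the hierarchy is still recorded, but the software no longer draws type conclusions from it.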
The hierarchical relationship used in taxonomies (and thesauri) is the one called "broader-narrower", which is just a generic taxonomic relationship stating that the one term is more specific (narrower) than the other, which is more general (broader).

Comments

Cathy Legg - 2009-02-02 20:32:28

A good paper on automatically distinguishing between super-type-subtype and instance-class relationships is Zirn et al (2008) "Distinguishing between Instances and Classes in the Wikipedia Taxonomy", tho' this is not a paper in topic maps.