Larsblog


Published subjects and PSIs

Posted in Technology on 2007-01-04 15:25

People often find the basic idea of published subjects quite clear and simple, but stumble over the details, so I thought I'd write a little overview of the territory. The idea is to sketch out the basic concepts and how they fit together.

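There are basically two layers to this: subject indicators, as defined by the TMDM, and published subjects, as defined by the Published Subjects recommendation from OASIS.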

Subject indicators

The TMDM defines the concept of a "subject indicator", which is just a web page that indicates a subject. That is, if a human being reads the page it should clearly indicate one single subject to the reader. The URI of that page can then be attached to a topic as a "subject identifier". (A subject identifier does not have to refer to a published subject indicator, or PSI; the URI of any page that indicates the subject will do.)

Praying Mantis (Hiroshima, Japan)

An example will make this clearer. Let's say I have a photo, and I want to say that the photo shows a praying mantis. This is straightforward enough: I make a topic for the photo, another for the concept of "praying mantis", and then associate the two. But let's say I want to increase the chances that if I merge my photos with those of other people we'll get a single place to find all photos of praying mantises. I can do that by attaching a likely subject identifier to my praying mantis topic. A good choice might be http://en.wikipedia.org/wiki/Praying_Mantis which is a well-known page that clearly identifies this concept to a human being.
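To make the shape of this concrete, here is a minimal sketch in Python. The `Topic` and `Association` classes, the "depicts" association type, and the role names are all made up for illustration; they are not the API of any real Topic Maps engine.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class Topic:
    # Hypothetical, simplified topic: a name plus a set of URI strings.
    name: str
    subject_identifiers: Set[str] = field(default_factory=set)


@dataclass(frozen=True)
class Association:
    # Hypothetical association: a type plus (role, topic name) pairs.
    type: str
    roles: Tuple[Tuple[str, str], ...]


# One topic for the photo, one for the concept it shows.
photo = Topic("Praying Mantis (Hiroshima, Japan)")
mantis = Topic("Praying mantis")

# Attach the Wikipedia URI to the concept topic as a subject identifier.
mantis.subject_identifiers.add("http://en.wikipedia.org/wiki/Praying_Mantis")

# Associate the photo with the concept.
depicts = Association(
    "depicts",
    (("photo", photo.name), ("subject", mantis.name)),
)
```

Note that the photo topic has no subject identifier here; only the concept topic carries one, since that is the topic I would like to see merged with other people's.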

There's nothing about the web page itself that makes it a subject indicator. It only becomes one when you attach its URI to a topic as a subject identifier. So at the moment the Wikipedia page for praying mantis is, as far as I know, not a subject indicator. But if I really created that topic, it would become a subject indicator for my topic.

What happens in a Topic Maps implementation when you do this is that any other topic that has the same subject identifier (that is, the same URI) attached to it will be forced to merge with my topic. This is why it's important that one choose a page that clearly identifies a single subject; otherwise you might get merges with subjects that don't match exactly.

The Topic Maps implementation doesn't follow the URI in any way; the only thing it does is to compare the URIs as strings. If they are equal, the topics merge. If they are not, nothing happens. And since an exact URI match is required for a merge, you really do want a well-known page with a simple URI. Note that this means that if the subject identifier doesn't actually refer to any web page everything will still work on the technical level. This is not considered optimal practice, but it works.
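The merge rule is simple enough to sketch in a few lines of Python. This is only an illustration of the string-comparison idea, not how any particular Topic Maps engine does it: the dictionary layout and the `merge_topics` function are invented for the example, and a real implementation would also have to merge names, occurrences, and associations properly.

```python
def merge_topics(topics):
    """Merge topics that share at least one subject identifier.

    Identifiers are compared as plain strings; nothing is fetched
    from the web. Each topic is a dict with "names" and
    "subject_identifiers" (a made-up layout for this sketch).
    """
    merged = []
    for topic in topics:
        names = set(topic["names"])
        ids = set(topic["subject_identifiers"])
        remaining = []
        for other in merged:
            if ids & other["subject_identifiers"]:
                # Same URI string on both topics: they must merge.
                names |= other["names"]
                ids |= other["subject_identifiers"]
            else:
                remaining.append(other)
        remaining.append({"names": names, "subject_identifiers": ids})
        merged = remaining
    return merged


mine = {
    "names": {"Praying mantis"},
    "subject_identifiers": {"http://en.wikipedia.org/wiki/Praying_Mantis"},
}
theirs = {
    "names": {"Mantis"},
    "subject_identifiers": {"http://en.wikipedia.org/wiki/Praying_Mantis"},
}
# Same page in spirit, but the URI string differs in case, so no merge.
near_miss = {
    "names": {"Mantis (other spelling)"},
    "subject_identifiers": {"http://en.wikipedia.org/wiki/praying_mantis"},
}

result = merge_topics([mine, theirs, near_miss])
```

Here `mine` and `theirs` end up as one topic with both names, while `near_miss` stays separate because its URI is not character-for-character identical. That is the practical reason to prefer a well-known page with a simple URI.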

Thus far the TMDM. The next part of the story comes from the Published Subjects recommendation from OASIS, which builds on the TMDM.

Published subjects

There are several issues with just choosing an existing web page and using it as a subject indicator:

Published Subject Indicators (PSIs) solve (or at least alleviate) these problems. A published subject indicator is just a web page that was created specifically to be a subject indicator. (And a published subject is a subject for which there is a PSI.) There is actually a PSI for April 4, 2000, which you may want to look at.

This is good because a good publisher will

Of course, there's no guarantee that a publisher will be good, but over time the best ones for any given subject should win out. And if there is no PSI for the subject you want already, you can become a publisher yourself.








Comments

Marc de Graauw - 2007-01-28 21:59:44

Reading this prompted me to write down some old reservations I have always had about the concept of PSI's.

See http://www.marcdegraauw.com/2007/01/28/the-trouble-with-psis/ for details.
