17.32, Monday 8 Jul 2002

The sameness of topic maps and search engines | Have a skim of The TAO of Topic Maps [via haddock]. It's good. Topic maps are a formalisation of sitemaps, but abstracted from the site. They're a standards-compliant way of mapping a domain of knowledge. Handy. Very specific. When there are lots, even better. But better than a search engine, say Google? Don't think so. Each has its place. Calling the search engine big and the topic map small is wrong, I think, even though that's what I'm very close to doing. So: a few gut feelings:

  • Topic maps and search engines are of equivalent size.
  • Scale on the www is not "quantity", nor is it a dimension running from domains to sites to pages to posts.
  • Topic maps aren't going to get usefully bigger beyond a certain size. Past that, semantic differences become a problem, as do updates.
  • Search engines aren't going to get much more specific. You'd need a semantic ontology for the entire www for that. Sorry, but I don't think that's going to happen.

Based on this, we need a new definition of size. The www feels fractal, but not in a containing kind of way. Each distance is equivalent. Domains don't contain pages. Information isn't contained in a single paragraph. References. Interlinks.

I suggest there are three dimensions to maps of the www:

  • Addressing accuracy (to the site, or to the page)
  • Semantic detail (meaning, or words)
  • Wideness (how much of the territory is covered)

Search engines cover vast amounts of the territory (very wide), but don't address it very accurately (only to the page level) and don't have much semantic detail. Topic maps are accurate and detailed, but not very wide. The two are of the same size.
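To make that concrete, here's a toy sketch in Python. The 0-to-1 scores and the names are entirely my own invention, nothing standard; "size" is just the volume of the box the map occupies along the three dimensions:

    # Toy model: score each dimension from 0 to 1 and call "size" the
    # volume covered. All the numbers below are illustrative guesses.
    from dataclasses import dataclass

    @dataclass
    class WwwMap:
        name: str
        accuracy: float  # addressing accuracy: 0 = site-level, 1 = paragraph-level
        detail: float    # semantic detail: 0 = bare words, 1 = full meaning
        wideness: float  # fraction of the territory covered

        def size(self) -> float:
            return self.accuracy * self.detail * self.wideness

    search_engine = WwwMap("Google", accuracy=0.2, detail=0.1, wideness=0.9)
    topic_map = WwwMap("topic map", accuracy=0.9, detail=0.8, wideness=0.025)

    for m in (search_engine, topic_map):
        print(m.name, round(m.size(), 3))  # both come out at 0.018

Wide-but-shallow and narrow-but-deep occupy roughly the same volume, which is all I mean by "the same size".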

Running with these instincts, maybe there's a limit on the size of maps: too much detail and the map stops being useful. Ditto wideness. And accuracy isn't helpful if you don't know much about what you're looking for. And so maybe the www does work along these dimensions; maybe if each point were locally Euclidean in these directions it'd be easier to find things. Maybe we should be thinking of the www not as a territory but as a vast number of maps, all interlinked, because distance on a map is more representative of the actual distance we have to travel when moving about it.
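And if the space really were locally Euclidean, "how far apart" two maps sit would just be straight-line distance over those three dimensions. Same made-up numbers as before, and the same caveat: this is a sketch of the idea, nothing more.

    import math

    def map_distance(a, b):
        # Straight-line (Euclidean) distance between two
        # (accuracy, detail, wideness) points.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    google = (0.2, 0.1, 0.9)       # wide, shallow
    topic_map = (0.9, 0.8, 0.025)  # narrow, deep
    print(round(map_distance(google, topic_map), 2))  # ~1.32

Same size, then, but a long way apart: different maps, not bigger or smaller ones.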

Update | It occurs to me that there's probably some standard method in network theory for calculating the size of a map/network. Would it be possible to recast semantic specificity somehow so this could be done?
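For what it's worth, one standard measure is the characteristic path length: the average shortest-path distance over all pairs of nodes. Here's a minimal from-scratch sketch on a made-up little link graph; it only does the network half, and says nothing yet about how you'd fold semantic specificity in:

    from collections import deque
    from itertools import combinations

    # A made-up unweighted link graph: node -> neighbours.
    links = {
        "a": ["b", "c"],
        "b": ["a", "d"],
        "c": ["a", "d"],
        "d": ["b", "c", "e"],
        "e": ["d"],
    }

    def hops(graph, start, goal):
        # Breadth-first search: number of links on the shortest path.
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            node, dist = queue.popleft()
            if node == goal:
                return dist
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return None  # unreachable

    pairs = list(combinations(links, 2))
    print(sum(hops(links, a, b) for a, b in pairs) / len(pairs))  # 1.6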