15.49, Sunday 12 Jan 2003

Adaptive design for weblog software | I've been thinking a little about software architecture, and primarily how to structure it to encourage two qualities:

  • the evolution of this particular piece of software
  • the evolution of the field

(based on the assumption that no program is going to be ideally suited when originally created, but that an iterative period will cause it to be better suited to use).

The spectrum of software development has two ends. On one end is the push model (yes, I'm going to lapse into the push/pull dichotomy again), which is the model where you set your sights on a goal, and build a tower to get there (like Windows). On the other end is the pull model, which is more like an ecology. Tiny steps, filling niches, each new piece of development just taking advantage of what's already there, and creating new capabilities -- like, life creates conditions conducive to life, in everything that it does. But it's undirected, not goal oriented, and slow. It can't be forced.

I've come to think that the Unix philosophy is towards the pull end. It's slow and steady, and each small piece is selected to be the fittest. Not only are components evolved, but abstraction layers too. But it's slow.

So every so often somebody looks ahead and says "We could do this instead!" so they leap and (in a fit of push development) make something that doesn't fit in with the ecology, like, well, Microsoft Word, or Quicktime, or maybe even XML. Then slowly, slowly, the ecology follows that leap ahead, and does it in its own way. There's a lot of idea-sharing involved, between both sides.

Anyway, I think pull is better, for two reasons. Firstly, because I think the resultant software itself is better. diff, patch and grep (say) are brilliant at what they do. And I think they've got there because they're separate programs, rather than as components of an application. Evolution can't happen to an application, it can only happen between applications. It's application versus application, but imagine how much better it would be instead if you could take the best parts of Microsoft Word and merge them with the best parts of BBEdit.

Trying to evolve a single monolithic application is like trying to sort using bogosort: shuffle everything at random and hope it comes out in order.

The second reason is that pull is a way of exploring the software landscape. A push model of development by its very nature means developing in a deliberate direction, which means features are limited by the imagination of developers. Exploring with pull on the other hand means the only limitation is how people - any people - combine the parts. Ideas for free, almost.

And I guess this is what people call adaptive design.

Those are the two qualities I mentioned at the beginning. How to design software to itself be evolvable, and be open to other possibilities in the greater software ecology? (Incidentally, I don't think we've had the fundamental improvements necessary in the ecology since diff and patch. XML is a leap-ahead; the ecology needs a better form of structured text, because then we can have solidly grounded zoomable interfaces.)

It's a way of designing software to allow features to be shallow.

Start with weblogs! There needs to be flexibility both in the components, and the abstraction layers themselves between the components (the second is the tricky bit). We can take some hints by looking at what's already in the software ecology. So, some design decisions.

  • For storage, we already have a standard database, the filesystem -- data should be stored as individual posts, in plain text files.
  • Lacking a better structured, extensible text format, the files should use something like RSS. The storage format must be independent of both output and input styles. (Or maybe use email style storage, with headers and body?)
  • There need to be growth points to anchor commenting, trackback, trails, categories and so on.
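
To make the storage idea a bit more concrete, here's a rough sketch of the email-style option: one post per plain text file, headers first, then a blank line, then the body. Everything named here (the header names, the `.txt` extension, the function names) is my own assumption for illustration, not a format anybody has agreed on. The point is that unknown headers survive a round trip, which is one way of leaving growth points for comments, trackback, categories and so on.

```python
from pathlib import Path


def format_post(headers, body):
    """Serialise a post as 'Name: value' header lines, a blank line, then the body."""
    lines = [f"{name}: {value}" for name, value in headers.items()]
    return "\n".join(lines) + "\n\n" + body


def parse_post(text):
    """Split a stored post back into (headers, body).

    Headers this code has never heard of are kept as-is, so another
    tool can add its own (hypothetical) 'Category:' or 'Trackback:'
    header without anything else needing to change.
    """
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


def load_posts(directory):
    """Read every *.txt file in a directory as one post, in filename order."""
    paths = sorted(Path(directory).glob("*.txt"))
    return [parse_post(p.read_text(encoding="utf-8")) for p in paths]
```

Because the files are just text, anything that can write a file (an editor, scp, a tiny HTTP gateway) can act as the input side, and `load_posts` neither knows nor cares which one it was.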

Although both input and output into the datastore are unspecified here, and many systems will jostle to provide them, some suggestions:

  • The input API should either use plain HTTP, like RESTlog, or allow storage on the filesystem without any further gateway (using scp, ftp, copying, moving, or editing files directly).
  • The output should be flexible and so loosely joined that even though there can be multiple templates and so on, there's no concept of republishing.
  • Templated output is a form of cache. Caching must be invisible.
  • Tiny tools should do most of the work, like qmail.
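
As a sketch of what "templated output is a form of cache" and "no concept of republishing" might mean in practice: the rendered page is rebuilt on request whenever the post or the template is newer than the cached copy, so nobody ever has to press a rebuild button. The `{{body}}` placeholder, the `.html` suffix and the function names are assumptions of mine, and comparing modification times is just one simple way to do it.

```python
from pathlib import Path


def render(source: Path, template: str) -> str:
    """Fill a (hypothetical) {{body}} placeholder with the post's text."""
    return template.replace("{{body}}", source.read_text(encoding="utf-8"))


def cached_output(source: Path, template_path: Path, cache_dir: Path) -> str:
    """Return the rendered post, regenerating the cached copy only if it is stale."""
    cache_file = cache_dir / (source.stem + ".html")
    newest_input = max(source.stat().st_mtime_ns, template_path.stat().st_mtime_ns)

    # Regenerate when the cache is missing or not strictly newer than its
    # inputs. Treating equal timestamps as stale costs an extra render now
    # and then, but never serves an out-of-date page.
    if not cache_file.exists() or cache_file.stat().st_mtime_ns <= newest_input:
        cache_dir.mkdir(parents=True, exist_ok=True)
        template = template_path.read_text(encoding="utf-8")
        cache_file.write_text(render(source, template), encoding="utf-8")

    return cache_file.read_text(encoding="utf-8")
```

Editing a post file directly, or swapping the template, is then enough to change what readers see. Deleting the whole cache directory loses nothing.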

Because of these principles, the system will work with any combination of input styles so long as the output of them is recognised, like the Movable Type text formatting approach.
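
One way to picture this is a small table of text formatters, each turning one input style into the single stored form. This is only loosely in the spirit of the Movable Type option, not a description of how it works, and the style names, the choice of HTML as the stored form and the function names are all assumptions for the sake of the example.

```python
import html


def convert_line_breaks(text):
    """Turn plain text into paragraphs: blank lines split paragraphs, single newlines become <br />."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(
        "<p>" + html.escape(p).replace("\n", "<br />") + "</p>"
        for p in paragraphs
    )


def passthrough(text):
    """The author already wrote the stored form, so leave it alone."""
    return text


# Hypothetical registry: adding an input style means adding one entry here.
FORMATTERS = {
    "plain": convert_line_breaks,
    "html": passthrough,
}


def to_stored_form(text, style):
    """Convert text written in the named input style to the stored form."""
    return FORMATTERS[style](text)
```

The rest of the system only ever sees what `to_stored_form` hands back, so a new input style never touches storage or output.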

Okay, so I'm looking at something that involves parts of RESTlog, parts of blosxom. It takes design hints from the way mail systems hang together: procmail for filtering, mbox or similar as a plaintext storage format, simple protocols to view and send mail... (Oh, and now you've read all of this, go back and read the comments in the hyperlink tooltips. There's yet more in those.)

>thinking<
