All posts made in Aug. 2004:

Another square of white shiny paper, 4 millimeters on the side again, same as last time. I stood up from the armchair in the front room early this afternoon and found it face-up on the cushion. I had been sitting on it. This one reads: 16. The ink is bright red.

Dead leopard in the garden, behind the bush, outside the patio room window. Sad. The yellow hairs are almost ginger, its eyes filmed with blue opals. No visible marks. How did it get there? Is it dead? I went to pick it up, its consciousness was still leaking out, going to ground. As I was holding its head, it earthed through me; the muscles in my legs still feel tingly and taut with leopard thoughts.

My neck itches. There are soft bristles on my back. The world is flattening, becoming permeable to me, hedges and trees receding into themselves. Now, inside the house, it feels like part of me is trapped, like I've got my sleeve caught on something. I panic. As I step outside I breathe out, and it's a breath that goes on forever, as I relax and soak into the grass, into the air. I spread as far as I can smell, feeling the scent texture lying over the landscape. I lounge like a puddle. I focus by trotting my body over. The bristles are getting longer. Then they stand up.

Next thing I remember there's hair tickling the roof of my mouth, and there are sinews scraping between my bottom teeth. Tepid meat in my mouth, heavy like a second numb tongue, the weight of the leopard's head borne by my jaw, tugging my canines as it swings, lazy blood on my chin and in my nose. Gloopy. It crackles slightly as it dries. The sky is hot and blue on my naked back.

Dead duck in the garden, behind the bush, outside the patio room window. Sad. Yellow/orange feet and bill, blue stripe along the wing, surprisingly light. No visible marks. Is it dead? Or is it slowed down, the knot of its consciousness moving at one millionth time, distributing into the soup of the universe?

My neck itches, and just now, I scratched it, and picked up, under my fingernail, a small shiny square of paper, 4 mm on the side, with black numbers printed on it: 20.

What do these things mean? A duck's dreams. My finger, the sticker: a synapse. And now I'm thinking about it, part of the loop. Engaged, embedded.

Diego Doval's atomflow. Go get, go play. Command-line tools for manipulating and querying an Atom datastore. And yes, as Ben says, simple, powerful, and our first fruit of EuroFoo.

So here's the objective: Blogging tools are monolithic. The same tool that deals with capturing the metadata also deals with user management and templating, and all the other integration features that get thrown in too. It doesn't have to be like that. The people who make great templates shouldn't also have to be the people who know what metadata to keep (last modified date? author? what sort of categories?), and everyone who writes a blog store should be able to benefit from the great templates that are written.

I've talked before about why I like Atom. It's because it's the fixed point around which all the rest can crystallise. The thing is, software decisions about what metadata to keep, about where to draw the abstraction layers, and how to design the interfaces: these things are really hard. We can attack the problem in two ways. First, the design of Atom and the Atom API codifies a number of these design decisions. Second, for the problems that haven't been decided yet (like how to query the datastore to get something that can be templated back), we take a best guess at how it should be done, and also make sure the system is evolvable so it can adapt as we learn.

See also my old post on adaptive design for weblog software. I've been working towards this myself, a little, but not using Atom (it didn't have a stable draft at the time of writing), and that's what I was demoing on Saturday (to Diego and Ben), after Ben's session and us talking with the whiteboard (which I wish I had a photo of).

The requirements then, given the above, are this:

You need a command-line tool to implement the Atom API. You could call it like

  • # cat entry.atom | atom-api
  • # cat entry.atom | atom-api --method=service.edit --post-id=
  • # cat entry.txt | make-atom --author=X --categories=A,B,C | atom-api

Then around that you can add an HTTP wrapper for your editing tools (or maybe it implements an HTTP service itself). The editing tools deal with user management (maybe integrating with permissions). Notice that atom-api is a command-line equivalent of the Atom API implemented as a REST web service, and we can cat other tools onto it to convert into Atom Entry format.

This tool is absolutely essential, the core of the whole system, and it can be this because everything is tightly specified.
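To make the first stage of that pipe concrete, here's a rough sketch of what a make-atom style converter could look like. It's a guess, not a spec: the function name make_atom_entry, its arguments, the rule that the first line of the text is the title, and the namespace (taken from the later RFC 4287 version of the format) are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumption: the namespace of the finalised Atom format (RFC 4287).
ATOM_NS = "http://www.w3.org/2005/Atom"


def make_atom_entry(text, author, categories, entry_id, now=None):
    """Wrap a plain-text post in a minimal Atom entry and return it as XML text.

    Assumption: the first line of the text is the title, the rest is the body.
    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")

    lines = text.strip().splitlines()
    title = lines[0] if lines else ""
    body = "\n".join(lines[1:]).strip()

    # Serialise with Atom as the default namespace (no prefix).
    ET.register_namespace("", ATOM_NS)
    q = "{%s}" % ATOM_NS

    entry = ET.Element(q + "entry")
    ET.SubElement(entry, q + "id").text = entry_id
    ET.SubElement(entry, q + "title").text = title
    ET.SubElement(entry, q + "published").text = stamp
    ET.SubElement(entry, q + "updated").text = stamp
    author_el = ET.SubElement(entry, q + "author")
    ET.SubElement(author_el, q + "name").text = author
    for term in categories:
        ET.SubElement(entry, q + "category", {"term": term})
    ET.SubElement(entry, q + "content", {"type": "text"}).text = body
    return ET.tostring(entry, encoding="unicode")
```

A thin wrapper that reads sys.stdin and prints the result would give the pipe-friendly shape shown in the bullets above; saving the entry stays a separate job for the next tool along.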

The next tool in the stack is to query the store, to take different slices.

  • # atom-slice --day 2004-08-24
  • # atom-slice --latest 15

And this would output an Atom Feed. You need a slice for every way you can access your weblog: Day archives, month archives, by category, by keyword, by post. The interface for this tool is less tightly defined, but it can basically grow the same way as grep has: continually being refactored, made more powerful, faster, more flexible, but essentially stay the same.
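As a sketch of how small the slicing logic can be, here are the two slices from the examples (by day, and latest N) written as functions over an Atom Feed document. The function names are made up, the namespace is the later RFC 4287 one, and I'm assuming every entry carries a published timestamp in UTC, so that plain string comparison sorts chronologically.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace of the finalised Atom format (RFC 4287).
ATOM_NS = "http://www.w3.org/2005/Atom"
ATOM = "{%s}" % ATOM_NS


def _entries(feed_xml):
    """Parse a feed document and return its entry elements."""
    return ET.fromstring(feed_xml).findall(ATOM + "entry")


def _published(entry):
    """Return the entry's published timestamp, or '' if it has none."""
    return entry.findtext(ATOM + "published", default="")


def slice_day(feed_xml, day):
    """Entries published on `day` (YYYY-MM-DD), in document order."""
    return [e for e in _entries(feed_xml) if _published(e).startswith(day)]


def slice_latest(feed_xml, n):
    """The `n` most recently published entries, newest first."""
    return sorted(_entries(feed_xml), key=_published, reverse=True)[:n]


def to_feed(entries):
    """Wrap entry elements in a bare feed element and serialise it."""
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(ATOM + "feed")
    feed.extend(entries)
    return ET.tostring(feed, encoding="unicode")
```

Because both slices take a feed and give back something that can be turned into a feed, they can be chained, which is the grep-like property I'm after.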

Last we need a templating tool. This is nothing more than a tool that takes an Atom Feed and some XSLT and produces HTML.

  • # atom-slice --latest 15 | transform --xsl blog-template.xsl

This is the least well defined. I suspect we'll evolve, then specify some kind of templating standard: what exit codes are required to return a 404, how to add information about the archives from the datastore and so on. But that's fine, so long as it works at first, then it can be refined and replaced later.
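To show how little the transform stage has to know, here's a sketch of one that takes an Atom Feed and produces HTML. Note the swap: this is not XSLT. It uses Python's built-in string templates so the sketch needs nothing beyond the standard library. The template markup, the function name render_feed and the RFC 4287 namespace are all assumptions.

```python
import html
import xml.etree.ElementTree as ET
from string import Template

# Assumption: the namespace of the finalised Atom format (RFC 4287).
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical per-post template; a real weblog would load this from a file.
POST = Template('<div class="post"><h2>$title</h2><p>$content</p></div>')


def render_feed(feed_xml, template=POST):
    """Render every entry of an Atom Feed through `template`, one per line."""
    feed = ET.fromstring(feed_xml)
    parts = []
    for entry in feed.findall(ATOM + "entry"):
        parts.append(template.substitute(
            # Escape the text so post content can't break the page markup.
            title=html.escape(entry.findtext(ATOM + "title", default="")),
            content=html.escape(entry.findtext(ATOM + "content", default="")),
        ))
    return "\n".join(parts)
```

The only contract is feed in, HTML out, so swapping this for an XSLT processor (or any other templating language) shouldn't disturb the tools on either side.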

The system I've been working towards myself (that I was showing to Diego and Ben on Saturday) is really simple. The components are:

  • Inject script to convert a weblog post from plain text into the xml format required for the datastore, and to store it (or instead of cat, use the pbpaste command): cat entry.txt | winject
  • Sync script to move my local weblog store up to the webserver: wrsync
  • Slice scripts to pull a collection of posts from the datastore: --day=2004-08-24 (or -n15)
  • Then finally we get to the web view, where there are cgi scripts to handle a query like week.cgi?date=2004-08-24, which just fetches some posts (using the command-line slice tool) and applies some XSL.

It's from having built this that I believe that saving an entry to the datastore is a separate job from converting plain text into an entry. It's also this which leads me to believe we need a standard (extensible, adaptable) interface for making slices across the datastore (because of the requirement to add other weblog feeds into the mix, archive information and so on: the front page of a weblog isn't just a transform of a feed, it's a transform of a compound document).

So where will atomflow lead? I think the fixed point of storing Atom documents in a store is the most important, and a command-line implementation of the Atom API is where it should lead. It's the kernel of the whole system. Next is the slicing script and the transform standard. With these two we can have:

  • A front-end cgi and manager that lets you pick and choose different templates
  • Shared templates in XSL
  • Drop-in transform script replacements that understand, say, some Perl templating language instead of XSL
  • A stand-alone script to transform the day's posts to email and send them out
  • Separate user-management and posting tools, integrating into different packages, or some different markup language or something

Then there are comments, category management, tagging and so on. All complexities that can be built on the same foundation, or an evolved version thereof.

I can see packaged systems that specialise on templating systems that distribute these command-line tools transparently, so people never know they're there... until they want to install a new indexing and word burst association tool and it just works.

What this feels like to me is a system open to change. Current installable weblog systems don't feel like this. Okay, Movable Type's APIs are really good and provide for a scriptable and reasonably flexible system. But we shifted from RPC web services to REST as an architecture for flexibility, from API to loosely joined. And command-line tools connected by pipe are loosely joined too, and more flexible than APIs. We're just taking the strength of the Atom stack, which goes upwards, and pushing it back downwards, behind the scenes. And that's why, because this is what I care about, I like Atom. RSS doesn't operate in this territory--and to be fair, it doesn't need to, and shouldn't. You add complexity to get this type of functionality, RSS's simplicity is avoiding this. RSS is simplicity for feed producers; Atom is simplicity for feed consumers, entry consumers, transformed entry consumers, entry and feed processors, everything.

Anyway, atomflow. A fantastic first step. I outlined above what got scribbled on the whiteboard. Diego's atomflow is already different, because it's evolved and he's taken into account pragmatics and what's useful. That's good, the more minds, the more changes, the better. And here's the thing: Diego wrote it in a matter of hours! It's as fully functional as we need, already! He's thought hard about the interfaces and now nobody needs to rethink that design, it's already good, and importantly, calcified into the code. And listen to what he says, about how he's using it already: So I built a scraper for that takes their stories on the frontpage, and outputs them to stdout as Atom entries. Then I pipe the result into atomflow, which allows me to query the result anyway I need, and, more interestingly, subsequent pipes to atomflow calls that can narrow down content when the query parameters are not enough. Awesome.

I'll be transforming my weblog store to Atom Entries and my templating to work with Atom Feeds very soon and running with it. Let's see what happens. I'll add the extra command-line tools I need, and release those back out again. A thousand flowers, etc.

Questions, the main thing about EuroFoo was people asking questions of other people. Everyone I met was doing startling things [seriously startling things, really] and happy to explain it to me, and to feed back additions and new angles on what I was doing too (which was rather unnerving, as everyone had more ambient knowledge about what I've been working on than I do, and kept on giving me new ideas). But questions, all over the place. Often suggestions and speculations, but always questions. And the sessions were really hour-long answers to the basic question "so, what are you interested in?" but in a conversational way. Lots of interaction (and more questions), and follow-ups in the bar (with more questions). So much sharing. Wonderful, what a crowd. So many people who have made things, I feel rather underachieving. It's motivating.

(Later: Aha, EuroFoo is a meatspace wiki. That's it!)

In other news, I have mosquito stigmata. I was bitten in bed, Sunday morning, once on each of the pulse points on my wrists. They itch like billy-o.

EuroFoo? Just got back, thanks. How was it? Best weekend I've had in ages. Huge amounts of fun. Mainly, three things: Loads of great conversations (brains, 3d printers for shoes (that is: to print shoes, and instead of shoes, like a superpower), and something about a mouse with the legs of a spider). Also, I spoke (5 minutes prep after being distracted in the bar all evening and an hour of bouncing round in front of a projector screen. I had fun, and I hope I made a tiny bit of sense to everybody else). Third, the student bar opposite with their cheesy euro dance that took me straight back to 1996 and the good old days. I haven't had a good dance like that for a long time. Tangent: check out Digitally Imported for streaming radio of this kind of thing.

I understand photos exist somewhere. I've not seen them. Ah.

So. Hello to everyone I met. Please mail me with whatever I asked you to remind me to let you know about! (Old friends and new friends too, too many to mention.) There's some fun stuff bubbling away in the syndication and blogging technology world. Got some great ideas, really workable, evolvable, exciting ideas going with Diego (who got code going too!) and Ben (also some novel ideas for syndication in games and interactive, long timescale state machines here).

Last things. I rediscovered my hat for this camp which has been with me now for nine years and has some wonderful memories associated. And then I left it on the tube, which had me truly miserable for a half hour until Phil said he'd rescued it. That was on the way back. Also, big shout out to my friends, the ducks. Some of them bark like quiet, breathy dogs. Lastly, thanks to Tim O'Reilly for kicking this off, the team who organized (especially Gina who printed demos out for me at close to midnight on Friday), and everyone for being there, running sessions, and sharing ideas. Oh, and for not publicizing the photos of me dancing around with my hands in the air. Right?

I am thinking it's a sign that the freckles/ In our eyes are mirror images and when/ We kiss they're perfectly aligned/ And I have to speculate that God himself/ Did make us into corresponding shapes like/ Puzzle pieces from the clay

Such Great Heights, The Postal Service. Listen online.

I am sitting here at Euro FOO, demonstrating code over lunch.

I'm looking for the statements on the cards in Velten, E. Jr. A laboratory task for induction of mood states. Behav Res Ther. 1968 Nov;6(4):473-82. If anyone has them, could you send them my way please?

This is about Atom, weblogging systems, and loosely coupled systems, but first, why I like RSS, and use it now:

  1. It's supported by everything
  2. It's really really simple

When I use RSS, I use RSS 0.91. It's my guarantee that the feed isn't going to go any further, and - so long as you can see it - you're not missing anything. What you see is what you get. I haven't seen anything that competes with RSS 0.91 for simplicity, or in its niche.

But RSS 0.91 is an output format, it's not an abstraction layer.

Why I like Atom:

Atom is an abstraction layer. I like Atom because, of all the weblogging systems I've seen, and most of the news systems too, it demands the minimal set of metadata you need to do useful operations.

Demands: Because Atom demands last-modified as well as a created date, that metadata has to exist. A pita the first time I encountered it, sure, just like the post title (which still annoys me). But what Atom says is: If you want to play the weblog game, this is the metadata you need. That, and some other stuff (it's all in the Atom spec).

What Atom is is the shape of the post. Just like email. Email isn't IMAP, or POP, or procmail, or Outlook, or SMTP, or Mail::Box, or mbox or Maildir, or any of the things that consume, produce, store or route email. It's RFC 2822, and email, for all its faults, has been enabled by saying: Headers look Like: This, you gotta have certain headers, and this is what a date is.

This is the way the world works. When it comes to chairs, there isn't some authoritative chair, and all chairs are like that chair, and some arse to sit on it, all arse instances being somehow like that arse. No. There's the act of sitting. There are things which are chair-like (which afford sitting upon) and things which are arse-like (which afford sitting with).

What Atom says is: If you want to afford these behaviours, then you have to be weblog-like in this respect and editor-like in this respect, and it doesn't matter what else you do, but those are the shapes we require.

And because Atom demands this certain minimum set of data, which has been decided by experience and thought, you can use [will be able to use] any editor, any weblog system that accepts posts, any templating system. What I'm describing here is the loosely coupled architecture, and what the benefit is, is evolvability.

We've had a lack of innovation in the weblog space because there aren't decent abstraction layers. You can inject posts, that's it, but bound together are: the datastore, templating, display, editing, permissions, commenting and so on. Occasionally there's some innovation -- there was a website that sold Blogger templates for a while, I believe, and Movable Type plugins extend the templating and display systems wonderfully. But the applications themselves are still monolithic.

Atom says: Given everything else is in flux, here's our one fixed point. The post. The metadata of the post. The relationships between posts. That's it.

Out of that single abstraction everything else can crystallise.

I can see a templating system spin out. The templating system for my weblog is a few lines of Python (calling libxslt) and some XSLT files for transformations. It needn't be -- Movable Type's biggest asset (in my opinion) is the community and plugins surrounding its templating system. I'd like to be able to drop that in instead. One day I hope I'll be able to.

The editing system will also spin out. But whereas currently it's a little hackish - the weblog editing tools are crippled by what the API offers - the Atom specification has been designed to allow fully functional editing tools. I'd like the functionality and simplicity of the old Blogger interface using the Atom API to post to my weblog. And sometimes I'd like to be able to post to it using a desktop app. And have a custom weblog search tool running on my desktop that's indexed my weblog (using the relationships in the posts and the Atom API) to assist me.

One day I hope I'll be able to.

Here's how I see a future weblogging system:

  • A datastore that holds the posts, possibly the file system
  • A templating system. Probably a cgi script that does XML transforms (so I can trade templates with people, because we all know we're operating on collections of Atom documents), or a Movable Type-like templating tool. Optionally a cache.
  • An inject script that implements the Atom API, allowing different tools to create, edit and manage posts
  • Off-the-shelf editing tools, aggregators, indexers and so on.
  • Drop-in tools that can be consumed by the templating system that produce side-bars, commenting and so on.
  • Maybe a standalone publisher that uses the templating system to generate static files and SFTP them to a remote location.

After the Atom API, a specification to do comments and another to do templating should probably be next. (The Atom API is a really big deal because it allows evolvability in a space where we already know there's demand. It's fairly easy to implement too, see the Atom API for a good case study. I do think templating will come next.)

There are tools we can't think of yet, which may build on the routing, or the linking, or any other corners of the Atom shape we haven't really exploited yet. (The Atom Entry is standalone in a way the RSS Item isn't.)

It'll be good, to see the end of monolithic weblog tools, and evolvability in its place. This is what I think Atom allows.

(I'm making a big assumption here, that design decisions and understanding the problem space are a bigger block to people getting involved than the ability to write code. Atom encodes an understanding of the problem so that people can make interoperable tools that concentrate on features, instead of thinking about which particular bits of metadata should be kept, and how to structure posts.)

A weblogging system made out of a bunch of parts like this sounds really technical and hard to install, but it shouldn't be. Free software should allow bundles to be made so installing is just a matter of making a couple of directories and dropping in some cgi scripts, as it is at the moment. But changing the templating system should be a matter of just pointing it at the same datastore. And an editing component that looks after user management should be drop-in and replaceable, in much the same way as you can switch between different webmail applications if you install them on a server to look at an IMAP mail store. (I guess what I'm talking about is the benefits of commoditisation. If you buy Joel on Software's argument that successful products have commoditised their complements (the other products and services you need to make use of them) then desktop weblog editing tools (ecto, basically) need to commoditise weblog publishing engines, and front-ends need to commoditise back-ends -- which is what Blogger did (accidentally) by publishing with FTP to begin with.)

A little more on RSS:

What RSS 1.0 has is machine-readability. Its problem wasn't complexity, but complexity without purpose. It hadn't been tested against the requirements of tools.

What RSS 2.0 has is no guarantees. It can't act as a fixed point (in my opinion). It's simple enough for publishing but given namespaces and the amount of optional elements, I can't be sure that if a tool supports the minimum spec I have everything I need to move between many different tools. It's as if RSS 2.0 is defined not by the spec, but by what tools that consume RSS 2.0 do with it -- much the same difficulty as Microsoft Word documents. Word docs are whatever Word chooses to see, regardless of what OpenOffice or other consumers do with them.

So I think RSS 2.0 will become just an output format, like a print stylesheet. It's not recombinant enough. But for me, RSS 0.91 already occupies that space.

Whereas with Atom, everything is tightly defined. Like email, I'd feel confident opening my post in lots of different tools and knowing it's just different views of the same thing. With RSS 2.0 I can't quite tell.

(And on the politics of the situation: My preference is for particular landscapes rather than technologies. I like the landscape that Atom would enable+encourage, but that's not to say RSS couldn't do the same, if it wants to. I'm not sure it does want to, it's already extremely well optimised for the landscape it wants (it's enormously popular and really well supported, no more evidence needed), a different one. So I'm not interested in politics, it's horses for courses. When Atom's finalised, I've still got applications I'll be making new RSS feeds for.)

What I'd like from Technorati is for me to be able to plug into them, loosely joined, as part of the ecosystem of the www.

I'm at Hypertext 2004 seeing all these people who can apply cool algorithms to linked nodes, and know how to display them. That'd be great to do with weblogs. In fact, it's ideal territory. The problem is this: First, you need loads of people trying different algorithms on anything new like this to see what works and what doesn't. Then everyone publishes (puts a website up), the best are copied, and we iterate. But the problem is that spidering and parsing blogspace is really hard. Like, so hard that people do the spidering and parsing stuff, extract some links, and then they're worn out, they stop. And mostly people don't even get that far, outside research labs (I met a guy from IBM with a local copy of the web).

Feature request #1: Figuring out permalinks is hard. Context is hard. Sorting a weblog into posts is hard. So let's not bother. What's the lowest-level thing that anybody would get value from? All we need is the url of a link, the link text (and title, if available), the url of the weblog on which it was found, and the time. That's all, Technorati have it already. I'd like a pubsub mechanism to get that from Technorati, a data stream, like the data stream of updated weblogs at, that other services can sit on top of and build on. I'd like to be able to set a script running on my computer and pipe every single link found, and when and where it was found, into a database, for analysis. To enable the large numbers of people hacking that evolvability requires. But, you say, that's Technorati's USP, to give timely search results! Fine, do what the stock markets do: Delay the data by 24 hours, it doesn't matter. But give us a stream, and be part of the ecosystem of the www.
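Here's roughly what the receiving end of that stream could look like, as a sketch. No such stream exists, so everything about the wire format is invented: I'm assuming one JSON object per line with the hypothetical keys url, text, title, weblog and time, which just mirror the fields listed above. The database is SQLite because it ships with Python.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the link database and make sure the table exists."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS links (
               url TEXT, link_text TEXT, title TEXT,
               weblog_url TEXT, found_at TEXT)"""
    )
    return db


def store_links(db, lines):
    """Insert one row per JSON line and return how many rows were stored.

    Assumed (hypothetical) record shape:
    {"url": ..., "text": ..., "title": ..., "weblog": ..., "time": ...}
    The title is optional; blank lines are skipped.
    """
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        db.execute(
            "INSERT INTO links VALUES (?, ?, ?, ?, ?)",
            (rec["url"], rec["text"], rec.get("title"),
             rec["weblog"], rec["time"]),
        )
        count += 1
    db.commit()
    return count
```

Piping a stream in would then be a single call, store_links(open_store("links.db"), sys.stdin), and the analysis is whatever SQL or script anyone cares to point at the table.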

What'll happen? Who knows. But when I've looked at analysis done of the way links move round blogs, it's the "links spread like infections" story most times. We could look for other patterns. We could have interesting visualisations. We could identify blogs that move in the same circles (because they get the same links in the same kind of time order), or see communities of interest. I don't know, and that's the wonderful thing. What I do know is that the people with the maths and the algorithms and the expertise to do this don't have the corpus of data to operate on. And that Technorati do have this data (already!), and can give it away so we can loosely join to them.

I've also been looking at XFN, a way of marking up links according to whether the person at the other end is a friend or colleague or whatever. Distributed data for a social network, basically. Tantek Celik and Eric Meyer are here, presenting it. It has a controlled vocabulary (accepted words, basically) which you can use to describe your relationships with people. But controlled vocabularies make me uncomfortable for some uses (this is one), and why stop at people? I'd be more interested in getting more value out of the links we post. See above, in other words.

But I'm also looking at and seeing that people sort and tag their links there in a wonderful way: the tag vocabulary is bottom-up. By giving immediate use to metadata, people add metadata. By giving use and visibility to the general usage of metadata, there's an incentive to converge on the same vocabulary. Great. But it's still a closed system, we can't have a thousand algorithms blooming and so on. So, let's merge XFN,, and feature request #1.

Feature request #2: Let's have a web-wide Let's define that, for every link I post on my weblog, I can add an attribute tags, and I'd use it like: tags="wayfinding brain"

All I want blogging systems to do: Let me attach those tags to all links, just as does. They will, if people want it, or we'll do it ourselves with plugins.

All I want Technorati to do: Push those tags out in the stream with the url, link text, blog url and time. That's all. It's a regular expression away.
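For the curious, pulling the tags out of a page is about this much work. This sketch uses Python's HTML parser rather than a regular expression, since it copes better with attribute order and quoting. The tags attribute is the one proposed above; the class and function names are made up.

```python
from html.parser import HTMLParser


class TaggedLinkParser(HTMLParser):
    """Collect every <a href=...> with its link text and proposed tags attribute."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link being read, if we're inside an <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            if "href" in attrs:
                self._current = {
                    "url": attrs["href"],
                    "text": "",
                    # tags="wayfinding brain" becomes ["wayfinding", "brain"]
                    "tags": (attrs.get("tags") or "").split(),
                }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def extract_links(html_text):
    """Return a list of {"url", "text", "tags"} dicts, one per link found."""
    parser = TaggedLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Links without the attribute come back with an empty tag list, so untagged weblogs still flow through the same stream.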

What I want someone else (or me) to do, which they (or I) will: clone the interface for this stream. It won't work for some blogs, fine. But it'll work for many. Give visibility to the tags of people I'm subscribed to, to provide a pressure for us to converge vocabularies (bottom-up, see). It's useful because I can have easy access to my own links, and everyone else's links. But it's merged with what I already do, all the feedback loops are in place. It should just work. And the bottom-up categorisation of the www can begin, only useful for each of us, each step of the way. (Imagine, automatic deductions that say "sci-fi is usually used with books", a loose category hierarchy that is fluid and handy. Opinion, fact, all the rest. Pragmatic).
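The "usually used with" deduction is nothing exotic. One simple way to get it, sketched here, is to count how often pairs of tags turn up on the same link. The function names and the 50% threshold are arbitrary choices of mine, and plain co-occurrence counting is only one of many things you could try on this data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tag_lists):
    """Count how often each tag, and each pair of tags, appears on a link.

    `tag_lists` holds one list of tags per link. Pairs are stored as
    alphabetically sorted tuples.
    """
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in tag_lists:
        unique = sorted(set(tags))
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return tag_counts, pair_counts


def usually_with(tag, tag_counts, pair_counts, threshold=0.5):
    """Tags that accompany `tag` on at least `threshold` of the links using it."""
    total = tag_counts[tag]
    if total == 0:
        return []
    found = []
    for (a, b), n in pair_counts.items():
        if tag in (a, b) and n / total >= threshold:
            found.append(b if a == tag else a)
    return sorted(found)
```

Run over everyone's tagged links, the same counts could also hint at which tags are synonyms and which nest inside others.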

Why don't we have these things already? Because parsing blogs is hard. But some people do it already, people like Technorati. Technorati have bootstrapped off the ping mechanism that made the first wave of blog tools - aggregators - possible. Now it'd be great if we could bootstrap off it.

I'm in the Bay Area next week for HT04. Noah Wardrip-Fruin has more about our blogging tutorial (come along if you're able). And Tom Coates suggests a get-together at the Tonga Rooms. Tiki and cocktails on the evening of Saturday 14th? Should be great fun.

I'm writing a book. Here's what I've been doing since May: Co-authoring together with my friend and cognitive neuroscientist-at-large Tom Stafford a book for the O'Reilly Hacks series (with our editor, the mighty Rael Dornfest):

Codenamed, Brain Hacks.

It's all about, well, let me dig out our original pitch. It's: 100 practical and understandable probes into the design quirks of the brain, concentrating on the sensory and motor functions and their coordination.

And this is the motivation: To get where it is, the brain has made some fascinating design decisions. The layering of systems has produced a complex environment, with automatic and controlled highly mixed. This development over biological time has introduced constraints. As has the architecture--it takes time for slow signals to make their way from one area to another. And there are computational difficulties too: How much of its capabilities can the brain afford to invoke when a sub-second response is required? The tricks used leave traces. There are holes in our visual field that we continually cover up. There are certain sensory inputs that grab our attention faster and more thoroughly than we'd expect.

You don't need to know all of neuroscience, cognitive psychology and so on to know how your brain works. I'm not a neuroscientist. I write, my undergraduate degree is in physics, I hack in my spare time, and I work in new media. But neuroscience has got to such a level now - with the imaging techniques in the last three or four years - that we can make focused probes into particular functions, and illustrate the traces that these design decisions have left (see where+how they are, and draw that up the stack towards conscious experience) and we can look at them one by one.

And so you can learn how your brain works. I'm not talking about a map or general mechanisms ("there are neurons which are connected blah") and I'm not talking about really high-level stuff ("this bit is active when you're motivated blah"). I'm talking about minute-by-minute stuff: This is why you scratch your face when somebody else does. This is what will grab your attention in the corner of your eye, and this is what won't. Why the status icons in the corner of your desktop should be black and white and not in colour. That's what Brain Hacks is about, letting you see how all that works, from a standing start.

There's so much I want to say right now. From what I've learned, and the way it's changed how I look at the world - I can now follow the way my attention gets attached to the internal and external world, anticipate what's going to cause subliminal behaviour, and induce it in other people (but don't tell them I've been doing that), oh and the philosophical implications too - to the process: our use of a wiki for research and organisation (the most successful usage I've seen), the pitch process, the nature of writing, writing under pressure, re-learning how to follow citation trails, balance opinions. That can all wait.

For the moment, this is just a note to say: This is why I've been working part-time at the BBC for the last few months, and I'm enormously pleased (a) it's happening, and (b) I can tell everyone. This is almost certainly the coolest project I have ever worked on.

Now if you'll just excuse me. GODDAMN THIS IS SO BLOODY BRILLIANT. Sorry, I've been wanting to say that for ages.

(Tom S has also posted about this. He is also going slightly mad, seeing the world through the fog of neuroscience.)

Update on 7 December 2004: The book is officially called Mind Hacks, has been released, and we have a Mind Hacks weblog too. What a journey!