I think the closest we’ve got to understanding the three steps is in patterns of data storage, and in the relation between data storage and the code that gives it behaviour, gives it life. SMTP gets close. Weblog posting APIs that are data-centric, that have more functionality depending on the richness of the data format, are close too.

So, the first step: You’re storing things, and you realise there needs to be abstraction. That’s the relational database. It makes sense for it to stand on its own. The idea is that everything is so normalised that any kind of action is a form of behaviour. You set a flag, or change a relationship, and that’s it. Your possible modes of behaviour are encoded into the structure of the data. There are two things: the data itself, and its structure.
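A rough sketch of that first step, using Python’s built-in sqlite3 module. The table and column names are made up for illustration; the point is only that the "behaviour" of subscribing someone is nothing more than setting a flag.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, "
    "subscribed INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO person (name, email) VALUES (?, ?)",
    ("Ada", "ada@example.org"),
)

# The only "behaviour" is a change to the data: set a flag on a row.
conn.execute("UPDATE person SET subscribed = 1 WHERE name = ?", ("Ada",))

# Read the flag back to see the effect of the action.
subscribed = conn.execute(
    "SELECT subscribed FROM person WHERE name = ?", ("Ada",)
).fetchone()[0]
```

Everything the system can do is already there in the columns and relationships; there is no code attached to the row at all.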

Second step: Objects wrap the data. The two are very tightly bound, and the data can only be got at through the code. But all objects offer methods that return primitive data types. In other words, they all exist in the same space. The object is an abstraction surface that moves from the deep structure of the datastore to the surface structure of method behaviours and primitives. There’s a very strong schema. It’s the schema that lets all the objects operate in the same space, as the schemas all exist in some kind of Platonic Realm of Forms.
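Something like the following, as a minimal sketch. The Person class and its methods are hypothetical, invented for this example: the record is hidden behind the object, and everything that comes back out is a primitive.

```python
class Person:
    """Hypothetical object wrapping one stored record."""

    def __init__(self, record):
        # The data is kept private; callers only reach it through methods.
        self._record = dict(record)

    def name(self) -> str:
        return self._record["name"]

    def email(self) -> str:
        return self._record["email"]

    def subscribe(self) -> None:
        # Behaviour lives in code, bound tightly to this one schema.
        self._record["subscribed"] = True

    def is_subscribed(self) -> bool:
        return bool(self._record.get("subscribed", False))


ada = Person({"name": "Ada", "email": "ada@example.org"})
ada.subscribe()
```

Strings and booleans are the shared space here. Any other object can consume what Person returns, but only because everyone agrees in advance on what a Person is.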

Third step: We’re getting to this in some worlds. The problem is that binding objects and data so tightly causes problems for the evolution of the code. To allow for evolution, we get rid of the schema. We don’t say that something has a particular “type,” we say it has certain structures that we can do things with. The code “binds” to structures in the datastore. Instead of storing things with a Person schema, loading that, and sending emails, we bind the code to any object that offers the “email” and “name” datastructures. More complex datastructures offer more possibilities. It’s true that data can share characteristics, yet the parts that aren’t shared never properly translate to each other. There aren’t really any primitives that are universally applicable. Every data structure is the centre of its own universe.

This is duck typing.
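Here is one way the third step might look, as a sketch rather than anything definitive. All the names (Person, MailingList, Photo, can_email, greeting_for) are invented for the example. The emailing code never asks what type a thing is, only whether it offers “name” and “email”.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    email: str


@dataclass
class MailingList:
    name: str
    email: str
    members: int  # extra structure that the email code simply ignores


@dataclass
class Photo:
    caption: str  # offers neither "name" nor "email"


def can_email(thing) -> bool:
    # Bind to structure, not to a type: anything with both fields qualifies.
    return hasattr(thing, "name") and hasattr(thing, "email")


def greeting_for(thing) -> str:
    return f"To: {thing.name} <{thing.email}>"


things = [
    Person("Ada", "ada@example.org"),
    MailingList("Staff", "staff@example.org", 12),
    Photo("sunset"),
]
messages = [greeting_for(t) for t in things if can_email(t)]
```

The mailing list gets emailed even though nobody declared it to be a kind of Person, and the photo is skipped without any error. Adding a new kind of thing with a name and an email needs no change to the emailing code.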

Matt Webb, posted 2005-09-02 (talk on 2005-06-11)