Designing user interfaces with bots not buttons

20.52, Monday 9 May 2022

I’ve seen a couple of examples recently of how super simple “bots” are replacing bits of user interface. I feel like this is a trend connected with the return of VR.

I am in love with the virtual events platform Skittish which is a 3D cartoon world (where everyone is a low-poly animal) for running multiplayer online parties, conferences, workshops etc.

RECOMMENDATION: Hit the “Try it now” button in the top right of their homepage and run around the sandbox. Talk to the other animals! Go into the editable room and try out placing the YouTube billboard!

My favourite omg-that’s-so-clever feature is character bots:

… you can now use the Skittish editor to place little animal NPC characters in your world, which you can supply with multiple lines of dialogue that appear automatically or when other players come into range. These are a great alternative to signage for welcoming people into the space, explaining what’s going on, or just giving the space some color.

This is the feature that guides you round the sandbox mentioned above.
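As a rough sketch of how little machinery a bot like this needs: the Python below is a guess at the mechanism, not Skittish's actual code. The class name `CharacterBot`, the `trigger_radius` parameter and the `update` method are all hypothetical, made up for illustration. The bot holds a position and a list of lines, and speaks its next line whenever a player newly walks into range.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class CharacterBot:
    """Hypothetical proximity-triggered NPC (an assumption, not Skittish's API)."""

    x: float
    y: float
    lines: list[str]
    trigger_radius: float = 3.0  # assumed distance at which the bot "notices" a player
    _next_line: int = field(default=0, init=False)
    _players_in_range: set[str] = field(default_factory=set, init=False)

    def update(self, player_id: str, px: float, py: float) -> str | None:
        """Return a line of dialogue when a player newly enters range, else None."""
        in_range = math.hypot(px - self.x, py - self.y) <= self.trigger_radius
        if not in_range:
            # Forget the player so they are greeted again on their next visit.
            self._players_in_range.discard(player_id)
            return None
        if player_id in self._players_in_range or not self.lines:
            return None  # already spoken to on this visit, or nothing to say
        self._players_in_range.add(player_id)
        # Cycle through the supplied lines, one per new arrival.
        line = self.lines[self._next_line % len(self.lines)]
        self._next_line += 1
        return line


bot = CharacterBot(x=0, y=0, lines=["Welcome!", "The talk starts at noon."])
print(bot.update("fox", 10, 10))  # fox is far away
print(bot.update("fox", 1, 1))    # fox walks into range
print(bot.update("fox", 1, 2))    # fox is still in range
print(bot.update("owl", 0, 1))    # a second player arrives
```

There is no language understanding here at all, which is rather the point: the bot is a sign that happens to have a body.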

And I’ve been poking at VR and the metaverse recently, noting that the “operating system” needs some work so naturally I’m thinking about that…

Whereas dialog boxes (the usual UI supplied by an OS) wouldn’t work in the Skittish sandbox, as they would totally break frame, these little automated NPCs (non-player characters) work really well.

So… maybe for our future VR user interface: tiny characters saying words, not dialog boxes?


I don’t want to go full conversational UI on this. It’s not about having human-level conversations or making chatbots. What’s good about the character bots in Skittish is how dumb they are. They definitely feel like part of the “machinery” of the place, but they’re integrated. They’re seamlessly integrated in the Skittish world, just as dialog boxes are “in-world” w/r/t the “desktop metaphor” world of Windows or MacOS.


There’s something in the air…

NPCs also appear in Gordon Brander’s experimental, work-in-progress, deceptively simple note-taking software Subconscious, as documented here.

The announcement of the alpha release (March 2022) gives a little tour of the software, with screenshots. It’ll be familiar to you! It’s a mobile app; you enter notes; you can search them.

Only this alpha app is about laying foundations, and at the bottom of the post is this intriguing para:

Next we bring in creative divergence. How? Geists! Geists will be little bots that live in your Subconscious. They’ll do useful things… finding connections between notes, remixing notes, issuing oracular provocations and gnomic utterances.

Geists?? Here’s the background: what if Clippy, but spooky?

I’ve been playing with some simple command-line prototypes for Geists. They’re just little scripts that find connections between notes, and use procedural generators to construct algorithmic provocations.

Which sounds… spot on. Like: it’s a feature that is so amazingly missing from Microsoft Office, now Brander has said it, that it should be added yesterday.

Like: when I talked about the writing systems of Cory Doctorow and Robin Sloan last year, what I found was that they both continuously mined their own notes and archives. I do the same – every new post, every article I write for work (I use the same technique I use in blogging to invent new ideas for the day job and for clients), they come from iterating over my own notes and making new connections.

Geists turn this user behaviour into a feature.
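To give a flavour of what such a script might look like, here is a minimal sketch. It is not Brander's prototype: the function names, the word-overlap measure (Jaccard similarity over keywords) and the templates are all assumptions invented for illustration. It finds the two notes that share the most vocabulary, then fills in a randomly chosen template to produce a "provocation".

```python
from __future__ import annotations

import itertools
import random
import re

# A tiny, assumed stopword list; a real tool would want something better.
STOPWORDS = {"the", "and", "for", "are", "not", "with", "that", "this"}

# Hypothetical provocation templates; every one mentions both notes.
TEMPLATES = [
    'What if "{a}" is secretly about "{b}"?',
    'You wrote about {word} in both "{a}" and "{b}". Why?',
    'Imagine "{a}" without {word}. What is left of "{b}"?',
]


def keywords(text: str) -> set[str]:
    """Lowercased words longer than two letters, minus stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def most_connected_pair(notes: dict[str, str]) -> tuple[str, str, set[str]]:
    """Return the two note titles with the highest Jaccard overlap, plus shared words."""
    best, best_score = None, -1.0
    for (t1, b1), (t2, b2) in itertools.combinations(notes.items(), 2):
        k1, k2 = keywords(b1), keywords(b2)
        union = k1 | k2
        score = len(k1 & k2) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = (t1, t2, k1 & k2), score
    if best is None:
        raise ValueError("need at least two notes")
    return best


def provoke(notes: dict[str, str], rng: random.Random | None = None) -> str:
    """Build one provocation from the most connected pair of notes."""
    if rng is None:
        rng = random.Random()
    a, b, shared = most_connected_pair(notes)
    # Sort before choosing so a seeded rng gives repeatable output.
    word = rng.choice(sorted(shared)) if shared else "nothing"
    return rng.choice(TEMPLATES).format(a=a, b=b, word=word)


notes = {
    "NPCs": "Little characters guide visitors around the virtual world",
    "Signage": "Signs guide visitors but break the frame of the world",
    "Lunch": "Sandwiches again",
}
print(provoke(notes))
```

Swapping in a smarter similarity measure or a richer generator wouldn't change the shape: find a connection, then say something gnomic about it.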


BUT why should a geist be an agent? Why not run the feature by tapping a button?

The answer, I think, is that the computer interface has always been a multiplayer environment for human and software agents. It’s just that we forgot for a while.

From the opening to Brenda Laurel’s 1991 book Computers As Theatre (Google Books) which reconceptualised the human-computer interface…

First, a history of the interface:

In the beginning, says Walker [founder of Autodesk], there was a one-on-one relationship between a person and a computer through the knobs and dials on the front of massive early machines like the ENIAC. The advent of punch cards and batch processing replaced this direct human-computer interaction with a transaction mediated by a computer operator. Time-sharing and the use of “glass teletypes” reintroduced direct human-computer interaction and led to the command-line and menu-oriented interfaces with which the senior citizens of computing (people over thirty) are probably familiar. Walker attributes the notion of “conversationalist” in human-computer interfaces to this kind of interaction, where a person does something and a computer responds–a tit-for-tat interaction.

But Laurel puts forward another model of conversation, the common ground, and quotes Herbert H. Clark and Susan E. Brennan to introduce it:

It takes two people working together to play a duet, shake hands, play chess, waltz, teach, or make love. To succeed, the two of them have to coordinate both the content and process of what they are doing. … They cannot even begin to coordinate on content without assuming a vast amount of shared information or common ground–that is, mutual knowledge, mutual beliefs, and mutual assumptions.

So whereas you could conceive of the screen as

  • a visual cache for working memory (as they saw it at PARC)
  • a “soft” replica of the buttons and switches of the old physical control panel

Laurel instead suggests:

Contemporary graphical interfaces, as exemplified by the Macintosh, explicitly represent part of what is in the “common ground” of interaction through the appearance of objects on the screen. Some of what goes on in the representation is exclusively attributable to either the person or the computer, and some of what happens is a fortuitous artefact of a collaboration in which the traits, goals, and behaviours of both are inseparably intertwined.

That feels a bit abstract. So to put it another way:

Laurel suggests that we see the computer screen as a stage on which there are agents, some human and some software, all doing their thing. And by taking this as a starting point, we can better design how all the agents come to a common understanding and make their intentions known.

It’s an astounding book.


SEE ALSO #1:

My post about files and icons from last year, which talked about an icon being a boundary object that has meaning in both the world of the user and the world of the machine. Common ground.

SEE ALSO #2:

A post at LessWrong the other day introduced the concept of narrative syncing, meaning sentences such as:

“The sand is lava; if you touch it you die” (when introducing the rules to a game, not making a false prediction)

Sentences which are to sync up with some set of other people as to how we all think or talk about a given thing around here (as opposed to sentences whose primary purpose is to describe a piece of outside reality).

Narrative syncing = establishing common ground.


In Brenda Laurel’s framing, where the interface is a stage for the interaction between users and machines, NPCs and geists make total sense.


What’s interesting is that Laurel developed the ideas in Computers as Theatre in Alan Kay’s lab at Atari (I am a fan of Kay), thinking about games in the widest sense, then went on to pioneer virtual reality in the VR boom of the early 90s.

Early 90s VR hype-crashed.

Now VR is back with the metaverse.

And suddenly ideas that fit naturally into Laurel’s framing are back again?

Not a coincidence. We’re unseating ourselves from the old desktop metaphor and readying ourselves for the VR interfaces of the future.


So here’s my prediction.

Not only in VR, but generally in software, we’ll see bots make a comeback. Or rather: NPCs. I like that term, this time around.

In the narrative framing of the UI of the metaverse, whether it is being used for socialising or for work, what is pre-eminent is the shared multiplayer world: the “common ground” is right there, and NPCs make more sense than dialog boxes.

Then, as users become accustomed to agents and NPCs, we’ll see more interfaces on desktops and phones that behave like Subconscious: bots not buttons.

If you’ve seen any other lo-fi bots and NPCs in software interfaces (experimental or otherwise) in the past 12 months or so, please let me know. I’d like to collect a few more examples.

If you enjoyed this post, please consider sharing it by email or on social media. Here’s the link. Thanks, —Matt.