The Voder in 1939 and high-bandwidth input devices

18.37, Wednesday 9 Jun 2021

Computers are pretty low bandwidth when it comes to input. Fingers and a couple square feet of looking, less with a phone. And not even much nuance. Swipes and taps, no vibrato or punching or feeling texture.

Back in the day, the Voder was the first device for voice synthesis, invented by Homer Dudley at Bell Labs and demonstrated by Helen Harper at the 1939 New York World’s Fair.

Here’s a video of the Voder in action (YouTube, 43 sec).

The user interface:

  • A wrist bar: The operator could select one of two basic sounds (breathing or speaking).
  • A foot pedal: pitch.
  • 14 keys to be used in chords to create various vowels and consonants, with several keys set aside for plosive sounds such as ‘p’ or ‘d’, and the affricate sounds of the ‘j’ in ‘jaw’ and the ‘ch’ in ‘cheese’.


After months of practice, a trained operator could produce recognizable speech.

Another article:

It was a difficult and unnatural process, and only between 20-30 people ever even learned how to use it.

Our phones must see humans as one big eye and one big finger.

But at this point, 78% of the world’s population has used a smartphone. When can these tiny computers stop catering for new users by default?

Like, could computer input be full-bodied, high bandwidth? I want a computer which is more like a Voder.

I recently heard about vim-clutch, which is a foot pedal that makes a particular text editor even more efficient to use (this is a great description from 2018 and here’s the project page).

I think of sewing machines, a device that requires a deftness of touch and also has a foot pedal, and I wonder whether Helen Harper, being the main Voder operator and one of the very few people skilled enough to play it, had experience with one.

I’ve had a taste of high-bandwidth input controlling my Mac cursor with my head – which I still use from time to time. My expectation was that I would use head control as an alternative to a mouse, leaning back, but actually I find myself using the keyboard, mouse, head cursor, facial expression triggers, and a little speech-to-text dictation for short phrases all at once. I would like to try a pedal, maybe to switch between “coarse” and “fine” modes. But the interface isn’t really designed for this.
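To make the pedal idea concrete, here is a rough sketch of what "coarse" and "fine" modes might look like in code. Everything in it is an assumption for illustration: the PedalCursor class, the gain values, and the idea that holding the pedal means "fine" are all made up, and nothing here talks to a real pedal or to the Mac's head pointer.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical sensitivity values; real ones would need tuning by hand.
COARSE_GAIN = 1.0  # pedal up: head movement maps directly to cursor movement
FINE_GAIN = 0.2    # pedal down: the same movement travels a fifth as far


@dataclass
class PedalCursor:
    """A cursor whose sensitivity depends on whether a foot pedal is held."""

    x: float = 0.0
    y: float = 0.0
    pedal_down: bool = False

    def set_pedal(self, down: bool) -> None:
        """Record the pedal state (True while the pedal is held down)."""
        self.pedal_down = down

    def move(self, dx: float, dy: float) -> tuple[float, float]:
        """Apply a raw movement delta, scaled by the current mode's gain."""
        gain = FINE_GAIN if self.pedal_down else COARSE_GAIN
        self.x += dx * gain
        self.y += dy * gain
        return (self.x, self.y)


cursor = PedalCursor()
cursor.move(10, 0)      # coarse: a big sweep across the screen
cursor.set_pedal(True)
cursor.move(10, 0)      # fine: the same gesture, a much smaller nudge
```

The point of the sketch is that the pedal doesn't add a new pointer, it changes what the existing one means, a bit like the Voder's wrist bar switching the sound source under the same keys.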

