Computer Input Sucks—A Brain Dump
Epistemic Status: Just dumping my current thinking, and information I have on the topic, together with some possible solutions. I have not optimized super hard to be very accurate. I expect most of the value of this post comes from becoming aware of the issue.
It seems like we are stuck in an inadequate equilibrium with regard to computer input.
I find it pretty weird that QWERTY is the default input method we use. It is very slow (much slower than you can think) and gives you RSI. It is possible to do much better than this, even without a BCI. I use the Dvorak keyboard layout. As far as I know, it does not make you type faster, but it does make your fingers move a lot less. This is good for preventing RSI. In Dvorak, the keys that are used the most frequently are in the home row of the keyboard.
However, the input speed problem remains. I write ~80 WPM on Dvorak. I think this slows me down. I can think much faster in language than this. This is especially noticeable in stream-of-thought writing, where you are optimizing a lot less for quality.
Alternative Input Machines
There are two alternative input schemes that I know about that seem promising: stenography and the CharaChorder One.
The goal of both of these systems is to write much faster. 225wpm is definitely possible with a lot of training in stenography. I am unsure about the top speed of the CharaChorder, but I would guess that it is at least 150, with the default setup. As far as I know, both of these systems do not induce RSI. Stenography was invented for court reporting. People still use it for that, and nowadays also for live captioning. Using it you can write for hours every day, without RSI becoming a problem.
Both of these systems speed you up, by enabling you to press multiple characters at the same time, which then creates a word or a syllable of a word.
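To make the chording idea concrete, here is a minimal sketch of how a chord could be resolved into a word. The dictionary entries and the function name `resolve_chord` are hypothetical and are not taken from Plover or the CharaChorder firmware. Real steno strokes are phonetic rather than letter-based, so this is closer to the CharaChorder style of chording.

```python
# Hypothetical chord dictionary: each entry maps an unordered set of keys
# to the word that should be emitted when those keys are pressed together.
CHORDS = {
    frozenset("the"): "the",
    frozenset("and"): "and",
    frozenset("with"): "with",
}


def resolve_chord(pressed_keys):
    """Return the word for a set of simultaneously pressed keys.

    The order in which the keys arrive does not matter, because the
    lookup is done on a frozenset. Returns None for unknown chords.
    """
    return CHORDS.get(frozenset(pressed_keys))


# All three keys land at roughly the same time, in any order.
print(resolve_chord(["h", "t", "e"]))
```

Because the lookup ignores key order, you do not have to hit the keys in sequence, which is where the speedup over character-by-character typing would come from.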
I have heard that it takes 1 to 4 weeks to get up to 40 WPM with the CharaChorder. Stenography takes much longer, I think; I expect 100-400 hours of training time. The CharaChorder has the advantage that you have all of the normal keyboard keys, whereas with stenography it might be harder to use keyboard shortcuts to control your desktop environment, though it is definitely possible using Plover.
The CharaChorder software is closed source, while Plover is open source. The CharaChorder costs $300, while you can get a steno keyboard for as little as $75-$140.
All the configuration of the CharaChorder is stored on the keyboard itself (at least if you use the default setup) so it is plug-and-play. Plover requires a bit of setup (~3-10min if you know what you are doing).
The last thing I have heard is that CharaChorder does not play very nice with vim, but they said that they would address this issue. I am unsure if this is fixed. I have not used the CharaChorder yet, but have ordered one that should arrive in a couple of weeks. I have learned Stenography for tens of hours and can confirm that it is very hard to learn. Much harder than learning how to touch type.
Automatic Speech Recognition
I have an S22 ultra, and I noticed that in Gboard I have a toggle in the options “Faster voice typing”. If I enable this, the ASR actually does not suck anymore and if I pronounce the words clearly, it has very high accuracy, even when I speak very fast. “AI alignment” is recognized correctly, though there are nonstandard words that are not recognized correctly.
When using the Gboard ASR, I estimate that I am slightly faster than using Dvorak, even when I take into account, that I need to reread the entire thing and correct any errors that showed up. I have not performed any rigorous tests though.
Walking While Writing
I have heard that walking stimulates your thinking ability. I think I can confirm this anecdotally. But if this is correct, then it seems crazy that we don’t have a standard solution for writing while walking (or at least not a solution that everybody knows about). I have a setup for this, that might solve this problem. I have not tested it yet.
The idea is to get AR glasses (I have these) and a laptop harness. Then you get a very light laptop, put it in the harness, connect the AR glasses such that they mirror the laptop screen, and voila.
Alternatively, you could get a phone that can run the AR glasses together with a Bluetooth keyboard that can connect to the phone. Having a phone with DeX is probably good. Then just put the keyboard on the laptop harness, and type away.
[Update 2023-04-10]
I’m now using Whisper, speech-to-text, for most of the things that I’m writing. This is much, much better than the Gboard speech-to-text. It is so fast that I now expect that learning stenography or the CharaChorder is not worth it anymore.
I have written this program such that I can use Whisper anywhere on my system to enter text.
It also has other advantages over the alternatives. I can now transcribe conversations that I’m having, or while I have a conversation transcribe what I’m saying, such that I can look back over what I was saying when I lose the thread.
[Update 2023-04-25]
I have done a single test reading out loud this article for 1 minute and transcribing it with Whisper. I reached 197 WPM on the first try. I only got a single punctuation error.
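For anyone who wants to repeat this kind of measurement, here is a small sketch of the arithmetic. The function names are my own. The first function counts whitespace-separated words, which is an assumption about how the figure above was obtained. The second uses the common typing-test convention of 5 characters per word.

```python
def words_per_minute(transcript, seconds):
    """WPM based on the number of whitespace-separated words."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return len(transcript.split()) / (seconds / 60)


def standard_wpm(transcript, seconds):
    """WPM using the typing-test convention of 5 characters per word."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (len(transcript) / 5) / (seconds / 60)


# Example: a 90-second dictation of a short placeholder transcript.
sample = "this is a short placeholder transcript for the example"
print(words_per_minute(sample, 90))
print(standard_wpm(sample, 90))
```

The two numbers can differ noticeably for text with many long or many short words, so it is worth saying which convention a reported WPM uses.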
CharaChorder has a new pants attachment called the “wear-a-chorder”.
I use this with the XREAL Air 2 Pro AR glasses.
I connect both of these to my S20 Ultra for Samsung DeX, or to a laptop tethered to a SIM for internet, which I stash in my backpack.
Two huge monitors in AR, a keyboard at my waist. And I can type at 120 WPM while at the grocery store or sitting in the park.
Coding and doing web design anywhere.
I have become a cyborg.....
How long have you been using this setup? I’d be curious to see if you use it more than 2 weeks consistently.
Is CharaChorder actually that much faster? The examples in their video didn’t seem that impressive; it was basically the same speed I type at on a QWERTY keyboard, and while I type at an above-average speed I’m not like.. secretary level or anything.
Maybe you have looked at a weird video. There are two things you can do on the CharaChorder. You can enter characters one by one in character mode, or you can press multiple characters at the same time, and they get automatically rearranged into words based on all the characters that you have entered. If you just do character entry, I would expect it’s not much faster than a keyboard. But the power comes from being able to press many keys at the same time and then immediately have a word or phrase pop up.
I watched the video on https://www.charachorder.com/products/charachorder-one (the one with the kid with glasses). They type somewhat fast but it didn’t really seem that much faster than I type with a QWERTY keyboard.
I noticed while typing this comment that I didn’t type quite as fast as them, but mostly because I was trying to compose what I wanted to write, not because I couldn’t type faster.
Based on some people that I talked to, it seems like you could get much faster than this. I spoke with one person that was a stenographer before and they said they could reach 200 words per minute. And what they are doing in the video is probably 120 words per minute. I’m not sure. Anyway, I think the best way to input text is using whisper speech-to-text right now anyway. At least if you take into account the learning curve and AI timelines.
If you don’t have issues with your current typing speed, like the ones I describe in this comment, then it’s probably not worth it for you to learn.
I notice that I have two different speeds of thought: “raw ideas” and “coherent sentences”.
Writing at the speed of raw ideas is an interesting proposition because I don’t think humans have ever actually been able to do that. Checking this assumption, I find https://en.wikipedia.org/wiki/Words_per_minute with the un-cited claim that most handwriting tops out around 20wpm.
To my knowledge, the fastest that people ever tried to write by hand was by specialists using shorthand, which surprisingly the same wiki page claims can get up to 350wpm in competitions. That’s a bit like looking at the olympics and saying “humans can run at 27mph”, but setting a bound still seems useful.
For capturing ideas while walking, have you considered using voice recording and/or speech-to-text?
Personally, I touch-type pretty fast because for a few years in middle school, the internet was my primary means of social contact. I more or less learned to type as fast as I wanted to talk. Today, people who hear me typing often express surprise when they discover I was really typing and not just pretending to, because they thought it was too fast; for reference, a free online test just clocked me at 111wpm. I mention this because I do not personally have the problem of thinking in coherent sentences faster than my typing speed. Conversely, I often have to pause to think through what I’d like to say next, because my typing has out-paced the upcoming-sentences buffer in my thoughts.
When brainstorming, I can think ill-formed ideas faster than I can type them, but I address this issue by capturing such ideas using incomplete notes. I find that the process of turning a thought into a coherent sentence helps clarify and define the thought, and that whole process goes slower than my typing speed.
Considering that many of the traditionally extolled benefits of “writing” seem to revolve around that reification of ideas, I don’t think I understand the problem space in the same way you do. Have you identified an input speed that you’d consider adequate? How does the hypothetical adequate speed compare to your speed of speech?
Depending on what I’m talking about, I might speak a lot faster than I can write. I haven’t measured this, though I would guess maybe around 150 words per minute. I write around 80 words per minute.
When I’m speaking I don’t really refine all my thoughts, and it’s more of a blurting out of stuff that I iteratively correct as I’m speaking. I might say something like, “Ah, I got the solution. It’s X. Wait, no, actually, this doesn’t work at all. Maybe it is instead Y, because Y doesn’t have this problem. Wait, no, actually I’m trying to get at a solution here, but I don’t even understand the problem yet. Let’s first try to get a better understanding of the problem.”
I think some people think I’m pretty stupid when I talk like this because most of the things that I’m saying are wrong or obvious in ways that you realize when thinking for 3 seconds. There’s a big difference between doing exploratory thinking where you’re trying to understand something and regurgitating something that you have understood in the past. Most people tend to not do this kind of exploratory reasoning out loud because it can make you look stupid. Well, at least I have never met anybody who does this to the extent that I’m doing it.
Actually, just talking about it makes me realize that I haven’t been doing it as much in recent times. I think in part because some people really, really didn’t like me doing this, and made me feel bad for doing it. I think that made me subconsciously update toward not doing it as much anymore. I think this is bad, especially because I did not even notice that this was happening. Thank you for making me realize that.
I think that when I want to produce high-quality outputs then I am probably a lot slower than when I’m writing just for figuring out what is even going on. When doing exploratory writing I would want to write things down that are like the aforementioned example of: “Ah, I got the solution. It’s X. Wait, no, actually, this doesn’t work at all. Maybe it is instead Y, because Y doesn’t have this problem. Wait, no, actually I’m trying to get at a solution here, but I don’t even understand the problem yet. Let’s first try to get a better understanding of the problem.”
And these sorts of trains of thought are generated a lot faster than I can type.
A completely different issue here is also that I am often writing in bursts, meaning I don’t have anything to say because I am thinking (possibly non-verbally) and then a finished idea pops into my head that I could articulate with words at over 200 words per minute. So, when I have this sort of break and then go dynamic, my writing speed also slows me down.
Update: I’m now using Whisper, speech-to-text, for most of the things that I’m writing. This is much, much better than the Gboard speech-to-text. It is so fast that I now expect that learning stenography or the CharaChorder is not worth it anymore.
I have written this program such that I can use Whisper anywhere on my system to enter text.
It also has other advantages over the alternatives. I can now transcribe conversations that I’m having, or while I have a conversation transcribe what I’m saying, such that I can look back over what I was saying when I lose the thread.
This is awesome. I found it via searching LW for variations of “voice typing”, which I was motivated to search because I had just discovered that average conversational speed is around 3x average typing speed (~150 vs ~50 wpm, cf ChatGPT). (And reading speeds are potentially in the thousands.)
At the moment, I’m using Windows Voice Access. It’s accurate, has some nice voice commands and gives you visual feedback while speaking. The inadequacy, for me, is the lack of immediate feedback (compared to typing), and customisability. I’ll attempt to test your repo tomorrow.
Also, I’m surprised that stenotypers can type faster than regular keyboards. I had previously speculated about the benefits of making keyboards more or less modular. I thought making them more modular would face similar problems as Ithkuil: too many serial operations to build the vector you mean. English relies extremely on caching specific words for specific situations with very shallow generalisations, trading semantic reach for faster lookup-times, or something. But I guess what matters most is modularity in the right places, and mainstream keyboards aren’t optimised for that at all.
I speculate that the ideal keyboard should have keys that chunk word-pieces somewhat according to Zipf’s law: the Nth most common key should be such that the frequency by which you have to use it is ~1/N. I guess there should also be a term somewhere for the distance the finger has to travel to reach a key or something, to be able to weight the frequency of keys by some measure for ergonomicity.
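As a rough illustration of what this could look like, here is a sketch that assigns Zipf-distributed usage frequencies to keys and weights them by finger travel. The function names and the distance values are made up for the example. The cost function, frequency times distance summed over keys, is just one simple assumption about how to measure ergonomics.

```python
def zipf_frequencies(num_keys):
    """Usage frequency of the Nth most common key, proportional to 1/N."""
    weights = [1 / rank for rank in range(1, num_keys + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_travel(frequencies, distances):
    """Average finger travel per keypress: sum of frequency * distance."""
    return sum(f * d for f, d in zip(frequencies, distances))


freqs = zipf_frequencies(4)

# Hypothetical finger-travel distances (arbitrary units) for four key positions.
near_first = [0.0, 1.0, 1.5, 2.0]   # most common key on the easiest position
far_first = [2.0, 1.5, 1.0, 0.0]    # most common key on the hardest position

print(expected_travel(freqs, near_first))
print(expected_travel(freqs, far_first))
```

Under this cost, putting the most frequent chunks on the closest keys always gives the lower expected travel, which is the same intuition behind Dvorak's home row.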
It’s likely too late, but you will not like my repo, because the transcription only happens once you have finished the recording. You could of course do it better, but I didn’t and probably won’t.
The main reason is that I become more stupid when using ASR. My current theory is that when you are writing something down, the slowdown effect is actually beneficial because you have more cognitive resources available to compute the next thing you write. Also, the lag in voice typing is pretty horrible. I still use it to write things where I just know what to say, and saying the correct thing is easy. But if I try to do something complicated it results in much worse text, that I am even less likely to read than things I write.
I can imagine an interesting experiment, where somebody knows stenography, and then they use a program that limits the input rate. I.e. there is a lock for X ms before you can enter the next chord. Then you could test how text quality changes as you vary the rate limit.
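The lockout part of such a program could be sketched roughly like this. The class `ChordRateLimiter` and its method names are hypothetical. Hooking it into real steno input (for example via Plover) is not shown and would need to be worked out separately.

```python
import time


class ChordRateLimiter:
    """Rejects chords that arrive within lockout_ms of the last accepted one."""

    def __init__(self, lockout_ms, clock=time.monotonic):
        self.lockout_s = lockout_ms / 1000
        self.clock = clock          # injectable so the limiter can be tested
        self.last_accepted = None   # time of the last accepted chord, if any

    def try_accept(self):
        """Return True if a chord entered now is allowed, False if locked out."""
        now = self.clock()
        if self.last_accepted is not None and now - self.last_accepted < self.lockout_s:
            return False
        self.last_accepted = now
        return True


# Usage: with a 500 ms lockout, a second chord right after the first is dropped.
limiter = ChordRateLimiter(lockout_ms=500)
print(limiter.try_accept())
print(limiter.try_accept())
```

Rejected chords do not reset the timer in this sketch, so the writer can always enter the next chord X ms after the last one that went through. Running the same writing task at several values of `lockout_ms` would give the comparison described above.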
I’ve noticed the same when voice-typing, and I considered that explanation. I don’t think it’s the main cause, however. With super-fast and accurate STT (or steno), I suspect I could learn to both think better and type faster. There’s already an almost-audible internal monologue going on while I type. Adding the processing-cost of having to translate internal-audio-code into very unnatural finger-stroke-code is surely a suboptimum. (I’m reminded of the driver who, upon noticing that he’s got a flat tire on one side, punctures the other as well. It’s a deceptive optimum; a myopic plan.)
I think there is likely no significant cognitive overhead for moving your fingers to type. I expect that is done by another part of your brain, which plays at most a secondary role in idea generation. I expect the same problem to show up in Stenography when you type as fast as you generate content.
You can perform this experiment right now. Instead of writing with a keyboard, write with pen and paper. When I write with pen and paper my writing improves in quality. And it seems this is because I am slower. The thing I actually end up writing down pops into my mind with such a delay, that I would have already written down a worse previously generated output, had I only been able to write faster. Potentially there are other variables influencing quality. E.g. your motor cortex is stimulated differently when using pen and paper, compared to using QWERTY.
[Thoughts ↦ speech-code ↦ text-code] just seems like a convoluted/indirect learning-path. Speech has been optimised directly (although very gradually over thousands of years) to encode thoughts, whereas most orthographies are optimised to encode sounds. The symbols are optimised only via piggy-backing on the thoughts↦speech code—like training a language-model indirectly via NTP on [the output of an architecturally-different language-model trained via NTP on human text].
(In the conlang-orthography I aspire to make with AI-assistance, graphemes don’t try to represent sounds at all. So sort of like a logogram but much more modular & compact.)
Interesting.
Anecdote: When I think to myself without writing at all (eg shower, walking, waiting, lying in bed), I tend to make deeper progress on isolated idea-clusters. Whereas when I use my knowledge-network (RemNote), I often find more spontaneous+insightfwl connections between remote idea-clusters (eg evo bio, AI, economics). This is bc when I write a quick note into RemNote, I heavily prioritise finding the right tags & portalling it into related concepts. Often I simply spam related concepts at the top like this:
The links are to concepts I’ve already spotted metaphors / use-cases for, or I have a hunch that one might be there. It prompts me to either review or flesh out the connections next time I visit the note.
I think these styles complement each other very well.