For more than a decade I have been systematically identifying error-prone
programming habits—by reviewing the literature, by analyzing other people’s
mistakes, and by analyzing my own mistakes—and redesigning my programming
environment to eliminate those habits. For example, “escape” mechanisms, such
as backslashes in various network protocols and % in printf, are error-prone:
it’s too easy to feed “normal” strings to those functions and forget about the
escape mechanism.
I switched long ago to explicit tagging of “normal” strings;
the resulting APIs are wordy but no longer error-prone. The combined result of
many such changes has been a drastic reduction in my own error rate.
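To make the escape-mechanism hazard concrete, here is a hypothetical sketch in Python rather than the C interfaces alluded to above: a logging helper that treats its argument as a template misbehaves whenever "normal" data happens to contain %, while an API that requires format strings to be explicitly tagged cannot be handed untagged data by accident.

```python
from dataclasses import dataclass

# Error-prone style: the message is treated as a template, so any '%'
# arriving in ordinary data is parsed as a conversion specifier.
def log_unsafe(msg, *args):
    print("LOG: " + msg % args)

log_unsafe("job %s finished", "backup")   # works as intended
# log_unsafe("progress: 50% done")        # fails: '% d' is read as a specifier

# Explicitly tagged style: only values wrapped in Fmt are ever formatted;
# a plain str is always literal data. Wordier, but hard to misuse.
@dataclass(frozen=True)
class Fmt:
    template: str

def log(msg):
    if isinstance(msg, Fmt):
        raise TypeError("tagged format string: use log_fmt() and pass arguments")
    print("LOG: " + msg)                  # never interpreted as a template

def log_fmt(fmt, *args):
    print("LOG: " + fmt.template % args)  # formatting happens only on request

log("progress: 50% done")                 # safe: '%' is just a character here
log_fmt(Fmt("job %s finished"), "backup")
```

The tagged version is wordier, but a forgotten tag becomes a loud type error rather than a silent misinterpretation of the data.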
Starting in 1997, I offered $500 to the first person to publish a verifiable
security hole in the latest version of qmail, my Internet-mail transfer agent; see
http://cr.yp.to/qmail/guarantee.html. There are now more than a million
Internet SMTP servers running qmail. Nobody has attempted to claim the $500.
Starting in 2000, I made a similar $500 offer for djbdns, my DNS software;
see http://cr.yp.to/djbdns/guarantee.html. This program is now used to
publish the IP addresses for two million .com domains: citysearch.com, for
example, and lycos.com. Nobody has attempted to claim the $500.
There were several non-security bugs in qmail, and a few in djbdns. My
error rate has continued to drop since then. I’m no longer surprised to whip up
a several-thousand-line program and have it pass all its tests the first time.
Bug-elimination research, like other user-interface research, is highly nonmathematical.
The goal is to have users, in this case programmers, make as few
mistakes as possible in achieving their desired effects. We don’t have any way
to model this—to model human psychology—except by experiment. We can’t
even recognize mistakes without a human’s help. (If you can write a program
to recognize a class of mistakes, great—we’ll incorporate your program into
the user interface, eliminating those mistakes—but we still won’t be able to
recognize the remaining mistakes.) I’ve seen many mathematicians bothered by
this lack of formalization; they ask nonsensical questions like “How can you prove
that you don’t have any bugs?” So I sneak out of the department, take off my
mathematician’s hat, and continue making progress towards the goal.
Personal experience: I found that I was able to reduce my bug rate pretty dramatically through moderate effort (~6 months of paying attention to what I was doing and trying to improve my workflow, without doing anything advanced like screencasting myself or even taking dedicated self-improvement time), and I think the reduction could be pushed even further by adding many layers of process.
In any case, I think it makes sense to favor the development of bug reduction techs like version control, testing systems, type systems, etc. as part of a broad program of differential technological development. (I wonder how far you could go by analyzing almost every AGI failure mode as a bug of some sort, in the “do what I mean, not what I say” sense. The key issues are that bugs don’t always manifest instantly and sometimes change behavior subtly instead of immediately halting program execution. Maybe the “superintelligence would have tricky bugs” framing would be an easier sell for AI risk to computer scientists. The view would imply that we need to learn to write bug-free code, including anticipating & preventing all AGI-specific bugs like wireheading, before building an AGI.)
See also: My proposal for how to structure FAI development.
Thanks, this is a great collection of relevant information.
I agree with your framing of this as differential tech development. Do you have any thoughts on the best routes to push on this?
I will want to think more about framing AGI failures as (subtle) bugs. My initial impression is positive, but I have some worry that it would introduce a new set of misconceptions.
I’m flattered that my thoughts as someone who has no computer science degree and just a couple years of professional programming experience are considered valuable. So here’s more info-dumping (to be taken with a grain of salt, like the previous info dump, because I don’t know what I don’t know):
My comment on different sorts of programming, and programming cultures, and how tolerant they are of human error. Quick overview of widely used bug reduction techs (code review and type systems should have been included).
Ben Kuhn says that writing machine learning code is unusually unforgiving, which accords well with my view that data science programming is unusually unforgiving (although the reasons aren’t completely the same).
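One hypothetical illustration (mine, not Ben Kuhn’s) of why such code is unforgiving: numerical bugs often don’t crash, they silently produce a plausible-looking wrong number. A shape mismatch that NumPy happily broadcasts is a typical example.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])        # shape (3,)
y_pred = np.array([[1.1], [1.9], [3.2]])  # shape (3, 1), e.g. a model's column output

# Intended: the mean of three per-example squared errors.
# Actual: (3,) - (3, 1) broadcasts to a (3, 3) matrix of all pairwise
# differences, so the "loss" averages nine numbers instead of three.
loss = np.mean((y_true - y_pred) ** 2)

print((y_true - y_pred).shape)  # (3, 3) -- no exception, no warning, wrong result
print(loss)
```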
Improving the way I managed my working memory seemed important to how I reduced bugs in my code. I think by default things fall out of your working memory without you noticing, but if you allocate some of your working memory to watching your working memory, you can prevent this and solve problems in a slower but less error-prone way. The subjective sense was something like having a “meditative bulldozer” thinking style, where I was absolutely certain of what I had done with each subtask before going on to the next. It’s almost exactly equivalent to doing a complicated sequence of algebraic operations correctly on the first try. It seems slower at first, but it’s generally faster in the long run, because fixing errors after the fact is quite slow. This sort of perfectionistic attention to detail was actually counterproductive for activities I worked on after quitting my job, like reading marketing books, and I worked to train it out. The feeling was one of switching into a higher mental gear: I could no longer climb steep hills, but I was much faster on level ground. (For reference: my code wasn’t completely bug-free by any means, but I did have a reputation on the ops team for deploying unusually reliable data science code; I once joked to my boss on our company bug day that the system I maintained was bug-free and I didn’t have anything to do, which made her chuckle in assent; and my superiors were sad to see me leave the company.)
For our company hack day, I wrote a Sublime Text extension that would observe the files opened and searches performed by the user in order to generate a tree diagram attempting to map the user’s thought process. This seemed helpful for expanding my working memory with regard to a particular set of coding tasks (making a scattered set of coherent modifications in a large, thorny codebase).
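For concreteness, here is a rough sketch of how such an extension might be structured. This is a reconstruction under assumptions, not the actual hack-day code, and it only tracks which files the user jumps between (hooking search events would take additional work). Sublime Text plugins are Python modules whose EventListener callbacks fire on editor events.

```python
# thought_trail.py -- illustrative sketch only, not the original extension.
import sublime_plugin

# Map each file to the file that was active when it was first visited,
# which is enough to reconstruct a simple "navigation tree".
_parent = {}
_last_file = None

class ThoughtTrailListener(sublime_plugin.EventListener):
    def on_activated(self, view):
        """Record which file the user jumped to, and from where."""
        global _last_file
        path = view.file_name()
        if not path:
            return
        if path not in _parent and _last_file and _last_file != path:
            _parent[path] = _last_file
        _last_file = path

class ShowThoughtTrailCommand(sublime_plugin.WindowCommand):
    """Run via sublime.active_window().run_command("show_thought_trail")."""
    def run(self):
        # Invert the parent map into a children map and find the roots.
        children = {}
        for child, parent in _parent.items():
            children.setdefault(parent, []).append(child)
        roots = sorted(p for p in children if p not in _parent)

        # Render the tree with indentation and dump it into a scratch buffer.
        lines = []
        def walk(node, depth):
            lines.append("    " * depth + node)
            for child in sorted(children.get(node, [])):
                walk(child, depth + 1)
        for root in roots:
            walk(root, 0)

        out = self.window.new_file()
        out.set_scratch(True)
        out.run_command("append", {"characters": "\n".join(lines) or "(no trail yet)"})
```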
Another thing that seemed useful was noticing how unexpected many of the bugs I put into production were, and trying to think about how I might have anticipated each bug in advance. I noticed that the most respected & senior programmers at the company where I worked all had a level of paranoia that seemed excessive to me as a newbie but that I gradually learned to appreciate. (We maintained a web app that was deployed every day, with a minimal QA team and no staging of releases, so we engineers had lots of opportunities to screw up.) Over time, I developed a sort of “peripheral vision” or lateral-thinking capability that let me foresee some of these unexpected bugs the way the more senior engineers did.
Being stressed out seemed to significantly impede my ability to write bug-free code (especially the “peripheral vision” aspect, subjectively). The ideal seems to be a relaxed, deep flow state. One of my coworkers was getting chewed out because a system he wrote continued causing nasty consumer-facing bugs through several rewrites. I was tasked with adding a feature to this system and determined that a substantial rewrite would be necessary to accommodate it. I was pleased with myself when my rewrite was bug-free practically on the first try, but I couldn’t help thinking I had an unfair advantage because I wasn’t dealing with the stress of getting chewed out. I’m in favor of blameless post-mortems. (The larger antipattern is the System 1 view of bugs as aversive stimuli to be avoided rather than valued stimuli to be approached; procrastinating on difficult tasks in favor of easy ones can be a really expensive and error-prone way to program. You want to use the context that’s already loaded in your head efficiently and to solve central uncertainties before peripheral ones.)
(Of course I’m conveniently leaving out lots of embarrassing failures… for example, I once broke almost all of the clickable buttons on our website for several hours before someone noticed. I don’t think I ever heard of anyone else screwing up that badly. In my post-mortem I determined that my feeling that I had already done more than could reasonably be expected on this particular task caused me to stop working when I ran out of steam instead of putting in the effort necessary to make sure my solution was totally correct. It took me a while to get good at writing low-bug code, and arguably I’m still pretty bad at it.)
“Do you have any thoughts on the best routes to push on this?”
Hm, I guess one idea might be to try to obtain the ear of leading thinkers in the programming world—Joel Spolsky and Jeff Atwood types—to blog more about the virtues of bug prevention and approaches to it. My impression is that leading bloggers have more influence on software development culture than top CTOs or professors. But it wouldn’t necessarily need to be leading thinkers spearheading this; anyone can submit a blog post to Hacker News. I think what you’d want to do is compile a big list of devastating bugs and then sell a story that as software eats the world more and more, we need to get better at making it reliable.
http://cr.yp.to/cv/activities-20050107.pdf (apparently this guy’s accomplishments are legendary in crypto circles)
http://www.fastcompany.com/28121/they-write-right-stuff