Seems like a decent reply overall, but I found the fourth point very unconvincing. Holden has said ‘what he knows now’ - to wit, that the world’s best experts would normally test a complicated programme by running it, isolating what (inevitably) went wrong by examining the results it produced, rewriting it, then doing it again.
Almost no programmes are glitch free, so this is at best an optimization process and one which—as Holden pointed out—you can’t do with this type of AI. If (/when) it goes wrong the first time, you don’t get a second chance. Eliezer’s reply doesn’t seem to address this stark difference between what experts have been achieving and what SIAI is asking them to achieve.
I agree about the glitch problem. But (1) programmers and techniques are improving; (2) people are more careful when they’re aware of danger; (3) if it’s hard but inevitable, giving up doesn’t sound like a winning strategy. I mean, if people make mistakes at some important task, how is it not a good idea to get lots of smart mathematicians to think hard about how to avoid those mistakes?
Note that doctors, biologists, nuclear physicists and rocket scientists aren’t glitch-free either, but those who work with dangerous stuff do tend to err less often. They do have to be aware of the dangers, though (or at least anticipate their existence). A doctor might try a different pill if the first one doesn’t seem to work well against the sniffles, but will be much less inclined to experiment when they know the problem is a potential pandemic.
(By the way, it is quite possible that the first AGI will be buggy, and a killer, and will foom in a few seconds (or before anyone can react, anyway); it might even be likely. But it’s still possible we’ll get several chances. My point is not that we don’t have to worry about anything, but that even if the odds are against us it still makes sense to try harder. And, hey, AFAIK the automatic trains in Paris work much better than the human-driven ones. It’s not quite a fair comparison in any direction, but it is evidence that we can make stuff work pretty well, at least for a while.)
ETA: You know, now that I think about it, it seems plausible that programmer errors would lean towards the AGI not working (e.g. you divide by zero, core dump, the program stops), while mathematicians’ errors would lean towards the AGI working but doing something catastrophic (e.g. your encryption program has exactly zero bugs and works exactly as designed, but ROT13 turns out to be cryptographically unsound after you used it to send that important secret). So maybe it’s a good idea if the math guys start thinking hard long in advance?
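Here is a minimal toy sketch of that contrast (my own illustration, not anything from the discussion; the function names `average` and `rot13_encrypt` are hypothetical). The implementation bug fails loudly the first time it is hit, while the design-level mistake runs flawlessly and only fails at its actual purpose:

```python
def average(numbers):
    # Programmer-style error: crashes on empty input (ZeroDivisionError),
    # so the mistake announces itself the first time it is triggered.
    return sum(numbers) / len(numbers)


def rot13_encrypt(plaintext):
    # Mathematician/design-style error: the code is bug-free and does exactly
    # what was specified, but ROT13 provides no real secrecy, so the program
    # "works" while silently failing at the thing it was built for.
    result = []
    for ch in plaintext:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            result.append(chr((ord(ch) - base + 13) % 26 + base))
        else:
            result.append(ch)
    return ''.join(result)


if __name__ == "__main__":
    # The design flaw succeeds quietly: trivially reversible "ciphertext".
    print(rot13_encrypt("attack at dawn"))  # -> "nggnpx ng qnja"

    # The implementation bug fails loudly and immediately.
    try:
        print(average([]))
    except ZeroDivisionError as err:
        print("loud failure:", err)
```

The point being that the second kind of error produces no stack trace; the only defence is getting the design right before the thing is ever run.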