Let’s see if I can get perm-ignore turned on by commenting on such an old post.
This whole line of thinking (press “on”, six million bodies fall) depends either on a self-modifying AI being qualitatively different from a non-self-modifying one, or on self-modification being the dominant strategy for achieving AI. In other words, it assumes there is a magic intelligence algorithm which, if implemented, will lead to exponentially increasing intelligence. Only then do you have to worry about the relative probability of that intelligence landing in the Navel Gazing, Paperclips, or Friendly categories (and, of course, about defining the categories) before you hit the switch on any candidate algorithm.
I think that intelligence is a very hard goal to hit, and that there is no self-contained, fast, non-iterative algorithm that rockets there. It is going to be much easier to build successive AIs with IQs of 10, 20, 30... (or 10, 20, 40...; the point stands either way) than to build a single AI which rockets itself off the scale. And in the process, we will need a lot of research into keeping these things stable just to get up to 100, let alone to make it to 5000. We will also learn “what kind of animal” the unstable ones are, and what kinds of mistakes they tend to make. Progress will be slow: there is a long way from IQ 100, when it starts to help us out, to IQ 500, when it starts to be incomprehensible to humanity. In other words, it is not actually a whole lot easier to do it unfriendly-style than friendly-style.
That does not, of course, mean we should stop worrying about unfriendly risks. But it does mean that it is silly hubris to imagine that yelling “slow down until we figure this out” actually helps our chances of getting it right. We will stub our toes many times before we get the chance to blow our brains out, and anyone who is afraid to take a step because there might be a bullet out there underestimates the complexity and difficulty of the task we face.