You don’t need to know exactly what it will do. For example, in the chess-playing case, you know that it will analyze chess positions and pick a chess move. You don’t have to know exactly how it will do that analysis (although you do know it will analyze without gaining resources, etc.). The more intelligent it is, the better it will do that analysis.
Sure. But it seems to me that the very essence of what we call intelligence—of what would distinguish something we were happy to call “artificially intelligent” from, say, a very good chess-playing program—is precisely the fact of not operating solely within a narrowly defined domain like this.
Saying “Artificial general intelligence is perfectly safe: we’ll just only ever give it tasks as clearly defined and limited as playing chess” feels like saying “Nuclear weapons are perfectly safe: we’ll just make them so they can’t sustain fission or fusion reactions”.
Incidentally: in order to know that “it will analyze without gaining resources, etc.”, surely you do need to know pretty much exactly how it will do its analysis. Especially as “etc.” has to cover the whole panoply of ways in which a superintelligent AI might do things we don’t want it to do. So it’s not enough just to give the AI tasks like “win this game of chess”; you also have to constrain its way of thinking so that you know it isn’t doing anything you don’t completely understand. Which, I repeat, seems to me to take away any reason for making a superintelligent AI in the first place.
I do not agree that you have to completely understand what it is doing. As long as it uses a fixed objective function that evaluates positions and outputs moves, and that function is derived from the game of chess and not from any premises concerned with the world, it cannot do anything dangerous, even if you have no idea of the particulars of that function.
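To make that concrete, here is a minimal sketch of the kind of function I have in mind, assuming the third-party python-chess package is available. The evaluation is a crude material count standing in for whatever the real function would be, and the move picker is a fixed-depth search over legal moves; everything the program manipulates is a chess position or a chess move.

```python
# A minimal sketch (not a serious engine) of a "fixed objective function"
# for chess, assuming the third-party python-chess package is installed.
# Every input and output is a chess position or a chess move; nothing here
# refers to anything outside the game.
import chess

# Fixed material values per piece type (illustrative assumption only).
PIECE_VALUES = {
    chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
    chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0,
}

def evaluate(board: chess.Board) -> int:
    """Score a position from White's point of view: material balance only."""
    score = 0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == chess.WHITE else -value
    return score

def negamax(board: chess.Board, depth: int) -> float:
    """Fixed-depth search; the only 'actions' it considers are legal moves."""
    if depth == 0 or board.is_game_over():
        sign = 1 if board.turn == chess.WHITE else -1
        return sign * evaluate(board)
    best = -float("inf")
    for move in list(board.legal_moves):
        board.push(move)
        best = max(best, -negamax(board, depth - 1))
        board.pop()
    return best

def pick_move(board: chess.Board, depth: int = 3) -> chess.Move:
    """Return the legal move whose resulting position scores best."""
    best_move, best_score = None, -float("inf")
    for move in list(board.legal_moves):
        board.push(move)
        score = -negamax(board, depth - 1)
        board.pop()
        if score > best_score:
            best_move, best_score = move, score
    return best_move

if __name__ == "__main__":
    print(pick_move(chess.Board()))  # prints a single legal opening move
```

The point is not the strength of the search but its shape: making evaluate() cleverer changes which move gets picked, not the kind of thing the program does.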
Also, I am not proposing that the function of an AI has to be this simple. This is a simplification to make the point easier to understand. The real point is that an AI does not have to have a goal in the sense of something like “acquiring gold”, that it should not have such a goal, and that we are capable of programming an AI in such a way as to ensure that it does not.