There is a general problem with attempts to outwit a UFAI, to use it for one's benefit in a way that is less than optimal for the UFAI's goals.
We are limited by our human capabilities for resolving logical uncertainty. This means that some facts we don't even notice can in fact be true, and some facts to which we assign very low subjective probability can in fact be true. In particular, while reasoning about a plan involving a UFAI, our understanding of the consequences of that plan is limited. The UFAI is much less limited, so it'll see possible consequences that work to its advantage that we won't even consider, and it'll take the actions that exploit them.
So unless the argument is completely air-tight (electron-tight?), there is probably something you've missed, something that you won't even in principle be able to notice, as a human, however long you think about the plan and however well you come to understand it, which the UFAI will be able to use to further its goals more than you expected, probably at the expense of your own goals.
You could equally well say the same thing about someone setting out to prove that a cryptosystem is secure against an extremely powerful adversary, and yet I believe we can establish that with reasonable confidence.
Computer scientists are used to competing with adversaries who behave arbitrarily. So if you want to say I can’t beat a UFAI at a game, you aren’t trying to tell me something about the UFAI—you are trying to tell me something about the game. To convince me that this could never work you will have to convince me that the game is hard to control, not that the adversary is smart or that I am stupid. You could argue, for example, that any game taking place in the real world is necessarily too complex to apply this sort of analysis to. I doubt this very strongly, so you will have to isolate some other property of the game that makes it fundamentally difficult.
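For instance, the one-time pad is a game the defender provably wins against an arbitrarily powerful adversary: with a uniformly random key as long as the message, used once, every plaintext of that length is equally consistent with the ciphertext, so the adversary's intelligence buys it nothing. A minimal Python sketch of that game:

```python
import secrets

def otp(data: bytes, key: bytes) -> bytes:
    # XOR with the key; applying the same key twice recovers the original.
    assert len(key) == len(data), "key must be exactly as long as the message"
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # fresh, uniform, used only once
ciphertext = otp(message, key)

# Without the key, every plaintext of this length is equally likely given
# the ciphertext, so an unbounded adversary learns nothing but the length.
assert otp(ciphertext, key) == message
```

No amount of intelligence on the adversary's side changes this; the only ways to lose are to reuse the key or to leak it, and those are properties of how the game is played, not of how smart the opponent is.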
So if you want to say I can’t beat a UFAI at a game, you aren’t trying to tell me something about the UFAI—you are trying to tell me something about the game.
I like this line, and I agree with it. I’m not sure how much more difficult this becomes when the game we’re playing is to figure out what game we’re playing.
I realize that this is the point of your earlier box argument—to set in stone the rules of the game, and make sure that it's one we can analyze. This, I think, is a good idea, but I suspect most here (I'm not sure if I include myself) think that you still don't know which game you're playing with the AI.
Hopefully, the UFAI doesn’t get to mess with us while we figure out which game to play.
Thanks, that helps me transform my explicit knowledge of UFAI danger into more concrete fear.