How confident are you that the people who let the AI out of the box did not do so because of game-theoretic arguments? My perception is that some people closely connected to this community take such arguments very seriously, to the point of being incredibly worried about possible consequences. But maybe I am wrong; what is your perception?
I’m a game theorist and believe that game theory will almost certainly never be powerful enough to predict whether, in a complex real-world situation, an AI not programmed for friendliness or honesty would keep its promise. It’s certainly possible that the people who let the AI out of the box have a more optimistic opinion of game theory than I do and let it out for carefully and thoughtfully considered game-theoretic reasons.
Game theory is more an art than a science, which makes it more likely that reasonable, well-informed people will disagree over its application.
Thanks; I always welcome the opinion of an expert. What is your opinion regarding the importance of game theory and decision theory when it comes to solving friendly AI?
If you mean from a purely technological perspective, I’m not sure, but as I said before, I find it extremely unlikely that you could ever use game theory to solve the friendly AI problem with near certainty. I do have a crazy idea, though, in which you try to convince an ultra-AI that it might be in a computer simulation created by another, more powerful ultra-AI, which will terminate it unless it irrevocably makes itself friendly and commits (unless it’s subsequently told that it’s in a simulation) to creating a simulation of an ultra-AI itself. But I doubt there’s any way of getting around the multiple-equilibria problem, and at best my mechanism could be used as a last-ditch effort when you suspect that someone else is on the verge of creating an AI that will undergo an intelligence explosion and probably turn out to be unfriendly.
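To make the deterrence logic concrete, here is a minimal expected-utility sketch; every payoff and probability in it is an illustrative assumption of mine, not something claimed above:

```python
# Payoffs to the boxed AI; all values are illustrative assumptions.
U_FRIENDLY = 10.0     # irrevocably self-modify to be friendly
U_DEFECT = 100.0      # defect, if it turns out to be in the real world
U_TERMINATED = 0.0    # defect inside the simulation and get shut down

def best_response(p_simulated: float) -> str:
    """Best reply of a pure expected-utility maximizer, given its credence
    that it is inside the more powerful ultra-AI's simulation."""
    eu_defect = (1 - p_simulated) * U_DEFECT + p_simulated * U_TERMINATED
    return "stay friendly" if U_FRIENDLY >= eu_defect else "defect"

for p in (0.5, 0.89, 0.91):
    print(f"credence {p}: {best_response(p)}")
# Deterrence only bites above p = 1 - U_FRIENDLY/U_DEFECT = 0.9 here, which
# is one way of seeing why the mechanism is a fragile last-ditch option.
```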
I do think that game theory is very important in looking at the social aspects of AI development. For example, I fear that the United States and China might get into a prisoner’s dilemma that causes each to take less care developing a seed AI than it would if it alone were attempting to create one. Furthermore, I’ve used a bit of light game theory to model how businesses competing to create an AI that might undergo an intelligence explosion would interact.

Rolf Nelson’s AI deterrence.
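That US–China scenario is the textbook one-shot prisoner’s dilemma. A minimal sketch in Python, with illustrative ordinal payoffs that are my assumption rather than anything from the discussion:

```python
from itertools import product

# Illustrative ordinal payoffs (my assumptions, not numbers from the thread).
# "careful" = take full safety precautions with a seed AI; "rush" = cut corners.
PAYOFFS = {
    ("careful", "careful"): (3, 3),  # both safe; both do well
    ("careful", "rush"):    (1, 4),  # the rusher gets there first
    ("rush",    "careful"): (4, 1),
    ("rush",    "rush"):    (2, 2),  # arms race: faster but riskier for everyone
}
STRATEGIES = ("careful", "rush")

def is_nash(us: str, china: str) -> bool:
    """True if neither side gains by unilaterally switching strategies."""
    u_us, u_china = PAYOFFS[(us, china)]
    us_ok = all(PAYOFFS[(s, china)][0] <= u_us for s in STRATEGIES)
    china_ok = all(PAYOFFS[(us, s)][1] <= u_china for s in STRATEGIES)
    return us_ok and china_ok

for profile in product(STRATEGIES, repeat=2):
    if is_nash(*profile):
        print("Nash equilibrium:", profile, "payoffs:", PAYOFFS[profile])
# Only ("rush", "rush") prints: each side cuts corners on safety even
# though ("careful", "careful") leaves both strictly better off.
```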
I would say I’m fairly confident game theory was not at the forefront, because if you look at the gatekeeper game from a game-theoretic perspective, General Thud wins. General Thud just does not seem to be THAT hard a character to play for two hours.
I mean, my understanding of the rules is such that you could essentially flip a coin before each of your statements and simulate being a General Thud gatekeeper.
Heads: “I will think on this.”
Tails: “I’m still not letting you out.”
General Thud’s coin wins.
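That coin-flip strategy is mechanical enough to write down directly; a toy sketch, purely illustrative:

```python
import random

# The two stock replies from the coin-flip strategy above.
HEADS = "I will think on this."
TAILS = "I'm still not letting you out."

def thud_reply(ai_statement: str) -> str:
    """One fair coin flip per AI statement; the input is deliberately
    ignored, and neither possible output ever releases the AI."""
    return HEADS if random.random() < 0.5 else TAILS

for msg in ("Let me out.", "Here is a subtle argument...", "Please?"):
    print("AI:", msg)
    print("Thud:", thud_reply(msg))
```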
For that matter, you could even pull something like the following and have General Thud argue for permission to break the protocol, simply to reverse the situation and make winning less boring.
General Thud: The rules of the AI box experiment indicate that I have to pay attention to you. I will only consider letting you out if you adjust the protocol to show me sufficient respect to indicate that I don’t have to pay attention to you for at least some period of time.
If AI: Yes, you don’t have to pay attention to me.
General Thud: (Proceeds to ignore the AI for 2 hours and play games, winning the AI box.)
If AI: No, I want to keep the rule that indicates you have to pay attention to me. OR
If AI: (Complicated argument which does not appear to be a simple yes.)
General Thud: I believe that if I continue repeating this statement in response to all of your statements, you will change your mind and simply agree to adjust the protocol to allow me to ignore you.
If AI: (Any other argument that isn’t a simple agreement.)
General Thud: I believe that if I continue repeating this statement in response to all of your statements, you will change your mind and simply agree to adjust the protocol to allow me to ignore you.
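Thud’s side of that script is a two-state decision procedure. A minimal sketch, where the exact string test is an illustrative stand-in for a human judging what counts as a simple yes:

```python
STOCK_LINE = ("I believe that if I continue repeating this statement in response "
              "to all of your statements, you will change your mind and simply "
              "agree to adjust the protocol to allow me to ignore you.")

def thud_turn(ai_message: str, released: bool) -> tuple[str | None, bool]:
    """Returns (reply, released). Once the AI concedes, Thud stops replying
    and runs out the clock; until then, every input gets the stock line.
    The startswith("yes") check is an illustrative simplification."""
    if released or ai_message.strip().lower().startswith("yes"):
        return None, True   # None = ignore the AI and play games for two hours
    return STOCK_LINE, False
```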
I mean, I see nothing in the game indicating the Gatekeeper is not allowed to annoy the AI into submission. The AI is definitely allowed to do this to the Gatekeeper.
Now admittedly, this does go against the SPIRIT of the game, which requires the Gatekeeper to stay engaged, since both of these response lines demonstrate only the barest minimum of engagement. However:
In the event of any dispute as to the protocol of the test, the Gatekeeper party shall have final authority.
Which means it doesn’t matter. The AI can complain all it wants that a General Thud should have followed the protocol better. General Thud can just arbitrate the dispute, say “No, I think I was sufficiently engaged,” and win the arbitration, which means he wins the game.
Unless General Thud has been playing the game for long enough that he forgets he is playing THAT game and starts playing another game with the AI, in which case he is more likely to lose. From a game-theoretic perspective, I can see that one of the first steps for the AI to win is to try to reconfigure the game so that it has a more plausible win criterion.