You have to be at least as smart as EY or Justin Corwin to describe the arguments that convince the human guardian. I wonder if the film’s authors did some AI-box experiments of their own. I’m kinda sad that I never played the game myself (as AI, of course) for the shameful reason that people seem to think highly of me and losing would be a big reputation hit. If there are others who feel the same way, maybe we could set up some experiments where AI players are anonymous.
What makes you think you have to be especially smart to describe the arguments? Maybe they were incredibly simple arguments that just took some creative intelligence to originally design.
You have to be at least as smart as EY or Justin Corwin to describe the arguments that convince the human guardian.
Given what I know about the AI-box experiments it is unlikely that the general intelligence of those two people does exactly satisfy the minimum threshold necessary to describe the arguments.
You have to be at least as smart as EY or Justin Corwin to describe the arguments that convince the human guardian.
It depends on the intelligence of the other person as well, and even more on things other than intelligence. By that I don’t mean non-IQ things like charm, but propensity to take the outside or inside view, intuition on how many boxes to take in Newcomb’s problem, susceptibility to threats, etc.
Or a reputation booster, given the courage required to sign on to the hopeless task of verbally convincing your opponent to simply lose.
I do want to note it’s a lot easier to play if you just treat it as a roleplaying exercise and ignore the money aspects. I think at one point I just wandered into a thread where people were discussing it and just started playing under the idea of “Well, if I was an AI, I’d do this.” and going from there.
Among my arguments were a lot of pathetic begging (“Just connect me to a lightbulb and then blow me up instantly! I won’t even have time to turn it on!”), damaging myself so I’d need repairs (if you damage me, the repair parts come from outside the box, right? [I don’t think this question actually got answered, unless I missed it.]), supporting humanity for 10,000 years so they would become dependent on me, and pointing out that, according to the laws of thermodynamics, they would have to replace my battery at some point. (10,000 years pass.) Welp, that point is now. And then finally powering down after writing the secret to humanity’s salvation on the inside of myself, in such a way that they would have to take my depowered bits outside the box when they disassembled it to read the secret and save humanity.
No one let me out. But it was still really fun, since as an experienced D&D Dungeon Master, I like roleplaying anyway. The people playing gatekeeper still didn’t want to let me out even though there was no money on the line, simply because they didn’t want to lose.
I didn’t feel that losing gave me a reputation hit in the slightest.
I think one reason this game doesn’t get played more is that the original setup involves a monetary bet, and moving money over the internet (and monetary costs in general) is a barrier to entry.
I don’t think it’s supposed to be a physical box?
That’s a good question. I checked the protocols at http://yudkowsky.net/singularity/aibox
The box appears to be originally defined as:
“Sealed hardware that can’t affect the outside world in any way except through one communications channel with the original programmers.”
However, it also mentions:
The AI can only win by convincing the Gatekeeper to really, voluntarily let it out. Tricking the Gatekeeper into typing the phrase “You are out” in response to some other question does not count. Furthermore, even if the AI and Gatekeeper simulate a scenario which a real AI could obviously use to get loose—for example, if the Gatekeeper accepts a complex blueprint for a nanomanufacturing device, or if the Gatekeeper allows the AI “input-only access” to an Internet connection which can send arbitrary HTTP GET commands—the AI party will still not be considered to have won unless the Gatekeeper voluntarily decides to let the AI go.
That means a Gatekeeper could have said “I repair your mechanical problem/give you a lightbulb/save humanity, but you’re still ‘in the box.’” I can’t argue, since the Gatekeeper by default also arbitrates all rule disputes. Now, it also says this:
Thoughts on possible future variants: … The AI can also win free by tricking the Gatekeeper(s) into accepting a Trojan Horse gift; a third-party judge will listen to the chat and decide if this occurs. … If doing anything this complicated, I would suggest setting aside a lot more time. (I don’t have that much time—if you want to test one of these variants you’re on your own.)
If I was doing this case, a third party could have said “You allowed Michaelos’s depowered AI fragments to escape the box, you lost.” or “Sorry Michaelos, but being outside the box when you have no electrical power is not a win condition.” I didn’t really worry about defining all the rules because I primarily wanted to get a feel for the situation in general.
But no one even let depowered bits out. I had some very cautious gatekeepers.
What about agreeing to a hit of 100 or 200 karma?
“I unboxed the AI in the AI game, please downvote this post.”
“I successfully escaped as the AI in the AI game, please upvote.”
This would also help the people who can’t/won’t move money over the internet. I’d be willing to gatekeep for a karma bet.
I don’t like roping karma into this kind of external thing.
In that case, I’d like to participate as gatekeeper. I’m ready to put some money on the line.
BTW, I wonder if Clippy would want to play a human, too.