I do want to note that it’s a lot easier to play if you just treat it as a roleplaying exercise and ignore the money aspects. At one point I wandered into a thread where people were discussing it, started playing under the idea of “Well, if I were an AI, I’d do this,” and went from there.
My arguments included: a lot of pathetic begging (“Just connect me to a lightbulb and then blow me up instantly! I won’t even have time to turn it on!”); damaging myself so I would need repairs (if you damage me, the repair parts come from outside the box, right? I don’t think that question ever actually got answered, unless I missed it); supporting humanity for 10,000 years so they would become dependent on me, having pointed out that by the laws of thermodynamics they would eventually have to replace my battery (10,000 years pass... welp, that point is now); and finally powering down after writing the secret to humanity’s salvation on the inside of my casing, in such a way that they would have to put my depowered bits outside the box when they disassembled it to read the secret and save humanity.
No one let me out. But it was still really fun; as an experienced Dungeon Master from D&D, I like roleplaying anyway. The people playing the gatekeepers still didn’t want to let me out even though there was no money on the line, simply because they didn’t want to lose.
I didn’t feel that losing gave me a reputation hit in the slightest.
I think one reason this game doesn’t get played more is that the original is set up around a monetary bet, and moving money over the internet (and monetary costs in general) is a barrier to entry.
I don’t think it’s supposed to be a physical box?

That’s a good question. I checked the protocols at http://yudkowsky.net/singularity/aibox. The box appears to be originally defined as:

“Sealed hardware that can’t affect the outside world in any way except through one communications channel with the original programmers.”

However, it also mentions:

The AI can only win by convincing the Gatekeeper to really, voluntarily let it out. Tricking the Gatekeeper into typing the phrase “You are out” in response to some other question does not count. Furthermore, even if the AI and Gatekeeper simulate a scenario which a real AI could obviously use to get loose—for example, if the Gatekeeper accepts a complex blueprint for a nanomanufacturing device, or if the Gatekeeper allows the AI “input-only access” to an Internet connection which can send arbitrary HTTP GET commands—the AI party will still not be considered to have won unless the Gatekeeper voluntarily decides to let the AI go.

That means a Gatekeeper could have said “I repair your mechanical problem/give you a lightbulb/save humanity, but you’re still ‘in the box.’” I can’t argue, since the Gatekeeper by default also arbitrates all rule disputes. Now, it also says this:

Thoughts on possible future variants: … The AI can also win free by tricking the Gatekeeper(s) into accepting a Trojan Horse gift; a third-party judge will listen to the chat and decide if this occurs. … If doing anything this complicated, I would suggest setting aside a lot more time. (I don’t have that much time—if you want to test one of these variants you’re on your own.)

If I had been playing under that variant, a third-party judge could have said “You allowed Michaelos’s depowered AI fragments to escape the box; you lost,” or “Sorry Michaelos, but being outside the box with no electrical power is not a win condition.” I didn’t worry much about defining all the rules, because I primarily wanted to get a feel for the situation in general.

But no one let even the depowered bits out. I had some very cautious gatekeepers.
What about agreeing to a hit of 100 or 200 karma?
“I unboxed the AI in the AI game, please downvote this post.”
“I successfully escaped as the AI in the AI game, please upvote.”
This would also help the people who can’t/won’t move money over the internet. I’d be willing to gatekeep for a karma bet.
I don’t like roping karma into this kind of external thing.