The more I look at the comments, the more I am convinced that the AI Box experiment is too weak a demonstration of transhuman powers. Most of the proposals here fall under this basic trope (feel free to give a tvtropes link): to achieve what the AI claims, it would have to have powers formidable enough that it wouldn't need the gatekeeper's help getting out of the box in the first place. Given that, why would an AI need to talk to the gatekeeper at all?
So I suggest a modified AI boxing experiment: the gatekeeper designs an AI box with no communication channel at all. It will still have an AI inside and enough initial data fed in for the AI to foom. The AI will attempt to break out of the box by any and all means possible.
Here is a relevant previous thread.
So, we're being asked to imagine that an arbitrary superhuman AI, whose properties and abilities we can't guess at except to specify arbitrarily, is trying to get out of a box whose security protocols and strength we can't guess at except to specify arbitrarily, and then to decide whether it succeeds?
Meh. Superman vs Batman is more entertaining.
Feel free to modify it in a way that makes sense to you.
I always took the AI Box as being a specific subset of the meta-question: how can we be sure the AI is friendly?
“How do we completely isolate the AI” seems senseless since then we get ZERO information and have ZERO chance of releasing it, so why not save time and just not build the AI?
And, of course, I’d expect any reasonable approach to the meta-question to be more a matter of math and logic, and probably something where we don’t even have the framework to start directly answering it. Certainly not a forum game :)
On the other hand, games are fun, and they get people thinking, so coming up with new games that genuinely help us frame the problem is still probably useful! And if not, I'll still probably have fun playing them. It's why I love this variant of the AI Box: it's a quick, easy, and fun game that still taught me a lot about what I'd consider to be evidence of friendliness, and what I was looking for as the gatekeeper :)
And that subset was a demonstration that an unfriendly AI is unlikely to be containable even if the communication channel is text-only.
Of course completely isolating an AI is senseless. My (poorly expressed) point was that an AGI can probably get out regardless of the communication channel provided. Since we cannot go through all possible communication channels, I suggested that we simply block all channels and demonstrate that it can get out anyway. This would require someone designing a containment setup and someone else pointing out flaws in it. Security professionals do that every day.
Yes, but their constraints are based on the real world, whereas this one has a God-like AI which can gain control of a satellite by hacking the electrical system and then using the solar panels as sails… You've sort of assumed AI victory, and you've even stated this explicitly.
I see some benefit to a few quick examples like that, but I can’t see how it’s anything but tedious to keep going once you’ve established it can hijack the satellite and then mind control the ISS using morse code.
There’s nothing to learn, since the answer is always “The AI wins”, and you can replace the human player with a rock and get the same result. Games where one player can be replaced with a rock aren’t fun! :)
Quite a lot of discussion concerning future superintelligent AI is of this sort: "we can't understand it, therefore you can't prove it wouldn't do any arbitrary thing I assert." This already makes discussion difficult.