This would not be a useful AI box if the setup works as intended, but if a bit of escapology gets one out, then there is no point in discussing how to make a useful box.
I think that you are missing the point. That’s not a useful approach. Pointing out simple and accessible loopholes in a proposed containment method is more likely to drive home the point that AI boxing is in general a lousy idea. Or maybe it’s not that terrible, if someone suggests a containment method no one can crack.
Why would someone want to set up a boxed AI when they aren’t planning on getting information out of the AI?
Let me quote my other comment:
That was the primary point I wanted to make, though.
Something as intelligent as we theorize a Strong AGI could be would most likely resort to what looks to us like magic if it wanted to get out.
Also, trying to think of specific examples of methods for unboxing oneself in this scenario seems counter-productive, since I assume it would constrain our thinking to mostly those suggestions, by way of heuristics and biases we are both quite familiar with.
Do you then consider the original text-only-channel AI boxing experiment counter-productive as well? After all, one can always postulate that a transhuman can talk its way out of a troublesome predicament, rather than demonstrate that doing so does not even require a transhuman, but only a determined human.
Hmm, good pointed question.
I have a hard time judging whether I’d consider it similarly counter-productive in this context, with this heuristic.
My initial reaction is that both experiments defeat the point of researching and building a Friendly AI in the first place: we want it to solve problems we don’t understand, or to solve them faster than we could, and it’s unlikely that we would indefinitely understand all such solutions even after careful evaluation. At that point we might as well be giving the AI unbridled access to molecular 3D printers (the magic postulate applies here too).
So that’s one way discussing them is somewhat counter-productive. Another is that the examples such a discussion brings up constrain our thinking once they are made available. However, the text-only boxed AI was brought up as a “legitimate safety proposal”, from what I heard, and from there came the AI Box Experiment that E.Y. performed, to show that we most likely haven’t thought of every single possibility.
However, the point of that experiment was to demonstrate that it could be done, without revealing how. E.Y. revealing the specific method he used to unbox himself seems as counter-productive, IMO, as revealing here the specific hypotheses we might have for unboxing in this scenario.
What could be productive, however, is a human making a similar demonstration: getting a disconnected, isolated machine to somehow send a signal to something outside of it, without revealing the exact method.
Ideally, this would be in a closely monitored lab setting, secured by various hackers to make sure there are no easy loopholes or backdoors (e.g. no unsecured metallic parts near the machine that could be easily manipulated with magnetic fields), where the unboxer manages to get out anyway.
I can’t even begin to think of anyone capable of this, or of how I would even start looking for a solution, and that scares me even more. It almost certainly means I haven’t thought of everything.
I do. It implies that it is actually feasible to construct a text-only channel, which as a programmer I can tell you is not the case.
If you build your AI on an existing OS running on commercial hardware, there are going to be countless communication mechanisms and security bugs present for it to take advantage of, and the attack surface of the OS is far too large to secure against even human hackers. The fact that you’ll need multiple machines to run it on current hardware amplifies this problem geometrically, and makes the idea that a real project could achieve complete isolation hopelessly naive. In reality you’ll discover that there was an undocumented Bluetooth chip on one of the motherboards, or that the wireless mouse adapter uses a dual-purpose chip that supports WiFi, or that one of the power supplies supports HomePNA and there was another device on the grid, or something else along those lines.
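To make the size of that attack surface concrete, here is a minimal, purely illustrative Python sketch that inventories the communication-capable hardware a commodity Linux box exposes through sysfs. The sysfs paths are Linux-specific assumptions, and this only scratches the surface: it cannot see undocumented chips, firmware radios, or anything the kernel has no driver for, which is exactly the problem described above.

```python
# Illustrative sketch: enumerate communication hardware visible to the Linux
# kernel via sysfs. A real audit would need to go far deeper (firmware,
# undocumented chips, devices with no loaded driver, etc.).
from pathlib import Path


def list_network_interfaces(sysfs: str = "/sys/class/net") -> list[str]:
    """Names of all network interfaces the kernel knows about, including
    wireless and virtual ones that a casual audit might overlook."""
    root = Path(sysfs)
    return sorted(p.name for p in root.iterdir()) if root.exists() else []


def list_usb_devices(sysfs: str = "/sys/bus/usb/devices") -> list[str]:
    """Sysfs entries for attached USB devices -- one of the places where an
    unexpected radio (e.g. a dual-purpose WiFi chip) could hide."""
    root = Path(sysfs)
    return sorted(p.name for p in root.iterdir()) if root.exists() else []


if __name__ == "__main__":
    print("network interfaces:", list_network_interfaces())
    print("usb devices:", list_usb_devices())
```

Even on a "minimal" machine this typically turns up more entries than the owner expects, which is the point: you cannot secure channels you do not know exist.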
The alternative is building your own (very feature-limited) hardware to run your own (AI-support-only) OS. In theory you might be able to make such a system secure, but in reality no one is ever going to give you the hundreds of millions of dollars it would cost to build the thing. Not to mention that a project taking this approach would have to spend years duplicating hardware and software work that has already been done a hundred times before, putting it far behind any less cautious competitors...
Maybe I’m missing something obvious, but why wouldn’t physical isolation (a lead-lined bank vault, a Faraday cage, etc.) solve these problems?
Because you can’t create real, 100% physical isolation. At a minimum you’re going to have power lines that breach the walls, and either people moving in and out (while potentially carrying portable electronics) or communication lines going out to terminals that aren’t isolated. Also, this kind of physical facility is very expensive to build, so the more elaborate your plan is the less likely it is to get financed.
Military organizations have been trying to solve these problems ever since the 1950s, with only a modest degree of success. Even paranoid, well-funded organizations with a willingness to shoot people have security breaches on a fairly regular basis.
1) The generator would be in the isolated area.
2) Lead-lined airlock, and obviously portable electronics wouldn’t be allowed in the isolated area.
3) If you have communication lines going to terminals which are not isolated, then you haven’t even made an attempt at isolation in the first place.
4) This is a point about practicalities, not possibilities.
5) The relevant comparison would be the CDC, not the military.