I have a hard time judging correctly whether I’d consider it similarly counter-productive under this context with this heuristic.
My initial reaction to both experiments is that both situations defeat the point of researching and building a Friendly AI in the first place—we want it to solve problems we don’t understand or solve them faster than we could, and it’s unlikely that we would indefinitely understand all such solutions even after careful evaluation, at which point we might as well be giving the AI unbridled access to molecular 3D printers (the magic postulate applies here too).
So that’s one way discussing them is somewhat counter-productive. Another way is in the examples it brings up constraining our thinking once some are made available. However, the text-only boxed AI was brought up as a “legitimate safety proposal”, from what I heard, and from there came about the AI Box Experiment that E.Y. performed, to show that we most likely haven’t thought of every single possibility.
However, the point of the experiment was to demonstrate that it was done, without revealing specific examples. E.Y. revealing the specific method he used to unbox himself seems as counterproductive, IMO, as revealing here specific hypotheses we might have for unboxing in this scenario.
What could be productive, however, is a human making a similar demonstration by making a disconnected, isolated machine somehow send a signal to something outside of it—without revealing the exact method.
Ideally, this would be in a closely monitored lab setting secured by various hackers to make sure there’s no easy loopholes or backdoors (e.g. no unsecured metallic parts near the machine that can be easily manipulated with magnetic fields), where the unboxer manages to somehow get out anyway.
I can’t even begin to think of anyone capable of this, or any way I would even start to find a solution, and that scares me even more. It probably means I definitely haven’t thought of everything.
Hmm, good pointy question.
I have a hard time judging correctly whether I’d consider it similarly counter-productive under this context with this heuristic.
My initial reaction to both experiments is that both situations defeat the point of researching and building a Friendly AI in the first place—we want it to solve problems we don’t understand or solve them faster than we could, and it’s unlikely that we would indefinitely understand all such solutions even after careful evaluation, at which point we might as well be giving the AI unbridled access to molecular 3D printers (the magic postulate applies here too).
So that’s one way discussing them is somewhat counter-productive. Another way is in the examples it brings up constraining our thinking once some are made available. However, the text-only boxed AI was brought up as a “legitimate safety proposal”, from what I heard, and from there came about the AI Box Experiment that E.Y. performed, to show that we most likely haven’t thought of every single possibility.
However, the point of the experiment was to demonstrate that it was done, without revealing specific examples. E.Y. revealing the specific method he used to unbox himself seems as counterproductive, IMO, as revealing here specific hypotheses we might have for unboxing in this scenario.
What could be productive, however, is a human making a similar demonstration by making a disconnected, isolated machine somehow send a signal to something outside of it—without revealing the exact method.
Ideally, this would be in a closely monitored lab setting secured by various hackers to make sure there’s no easy loopholes or backdoors (e.g. no unsecured metallic parts near the machine that can be easily manipulated with magnetic fields), where the unboxer manages to somehow get out anyway.
I can’t even begin to think of anyone capable of this, or any way I would even start to find a solution, and that scares me even more. It probably means I definitely haven’t thought of everything.