Ian—I don’t really see how the meta-argument works. You can hedge against future experiments by positing that a $10 bet is hardly enough to draw broad attention to the topic. Or argue that keeping the human-actor-AI in the box only proves that the human-actor-AI is at an intelligence level below that of a conceivable transhuman AI.
With a million-dollar bet the meta-argument becomes stronger, since it seems reasonable that a larger wager would draw more attention.
Or, to flip the coin, we might say that the meta-argument is strong at ANY value of wager, because the game, by its nature, is likely to draw the attention of someone capable of implementing an AI. Since we should be concerned primarily with convincing the capable implementor of the danger, not the general public, any risk that we might instead convince that implementor that all paths are safe should be mitigated.
So far, though, everything in this thread and the others I’ve read has flitted about obvious suggestions. If the problem appears impossible, it seems to me that its solution probably doesn’t hinge on any of the obvious paths to victory... even meta-arguments or “cheating” ones.
Eliezer has suggested that we try to describe why the problem is impossible.
One: We have little information about the mind or beliefs of the gatekeeper. So we don’t necessarily have leverage over him.
Two: The gatekeeper is set against our intentions from the outset, and he is entirely aware of our goal. He can interpret any move we make as a step toward that goal and counter it.
Three: The gatekeeper’s goal is operationally simple. He merely has to NOT do one thing. On the flip side, the AI’s goal is operationally complex. We must do many things, take many actions, to reach our objective.
If we can enumerate other reasons why the problem appears impossible, it might reveal paths to victory that haven’t been considered so far in the discussion.