I don’t understand.
I don’t care about “me”, I care about hypothetical gatekeeper “X”.
Even if my ego prevents me from accepting that I might be persuaded by “Y”, I can easily admit that “X” could be persuaded by “Y”. In this case, exhibiting a particular “Y” that seems like it could persuade “X” is an excellent argument against creating the situation that allows “X” to be persuaded by “Y”. The more and varied the “Y” we can produce, the less smart putting humans in this situation looks. And isn’t that what we’re trying to argue here? That AI-boxing isn’t safe because people will be convinced by “Y”?
We do this all the time in arguing for why certain political powers shouldn’t be given. “The corrupting influence of power” is a widely accepted argument against having benign dictators, even if we think we’re personally exempt. How could you say “Dictators would do bad things because of Y, but I can’t even tell you Y because you’d claim that you wouldn’t fall for it” and expect to persuade anyone?
And if you posit that doing Z is sufficiently bad, then you don’t need recourse to any exotic arguments to show that we shouldn’t give people the option of doing Z. Eventually someone will do Z for money, or from fear, or because God told them to do Z, or maybe they’re just really stupid. I’m a little peeved I can’t geek out over the cool arguments people are coming up with because of this obscurantism.
There are other arguments I can think of for not sharing strong strategies, but they are either cynical or circular. Cynical explanations are obvious. On circular arguments: Isn’t an argument for letting the AI out of the box an argument for building the AI in the first place? Isn’t that the whole shtick here?