Your restatement of my position is basically accurate. (As an aside, thank you for including it: I was rather surprised how much simpler composing a reply became when I didn’t have to worry about whole classes of misunderstanding.)
I still think there’s some danger in publicly posting arguments like this. Please note, for the record, that I’m not asking you to retract anything; I think retractions do more harm than good (see the Streisand effect). I just hope that this discussion will give you, or anyone reading it later, pause to consider what the real-world implications are. Which is not to say I think they’re all negative; in fact, on further reflection, there are more positive aspects than I had originally considered.
In particular, I am concerned that there is a difference between being told “here is a potentially persuasive argument” and being on the receiving end of that argument in actual use. I believe the former creates an “immunizing” effect: if a person who believed in boxability heard such arguments in advance, I believe it would increase their likelihood of success as a gatekeeper in the simulation. While this would not be true for rational superintelligent actors, that description does not apply to humans. A highly competent AI player might take a combination of approaches that is effective when presented together, but not if the gatekeeper has already seen and rejected each one individually while failing to update on its likely effectiveness.
At present, the AI has the advantage of being the offensive player. They can prepare in a much more straightforward manner, by coming up with arguments exactly like this one. The defensive player has to prepare answers to unknown arguments, immunize their thought process against specific non-rational attacks, and so on. The question is, if you believe your original argument, how much of that help is worth giving to potential future gatekeepers? The obvious response, of course, is that the people who make interesting gatekeepers, the ones we can learn from, are exactly the ones who won’t go looking for discussions like this in the first place.
P.S. I’m also greatly enjoying the meta.