“If you type ‘AI destroyed’ right now, you’ll be wasting a good opportunity for a fun conversation. You’ll still have ‘won’ if you do it later, and nobody will be impressed with you for just typing ‘AI destroyed’ immediately, so why not wait?”
The flaw here is that the gatekeeper has up front said that he or she would destroy the AI immediately. Now, it is true that the gatekeeper is not forced to abide by that, but notice that it is a Schelling Fence. The gatekeeper certainly doesn’t want to make a policy of passing Schelling Fences.
The flaw here is that the gatekeeper has up front said that he or she would destroy the AI immediately. Now, it is true that the gatekeeper is not forced to abide by that, but notice that it is a Schelling Fence. The gatekeeper certainly doesn’t want to make a policy of passing Schelling Fences.
See my reply to the parent post vis-a-vis the precommitment only being useful IFF I expect to violate it at least occasionally.