I think there’s an issue where Alice wants to be told what to do to help with AI safety. So then Bob tells Alice to do X, and then Alice does X, and since X came from Bob, the possibility of Alice helping manifest the nascent future paradigms of alignment is lost. Or Carol tells Alice that AI alignment is a pre-paradigmatic field and she should think for herself. So then Alice thinks outside the box, in the way she already knows how to think outside the box; empirically, most of most people’s idea generation is surprisingly unsurprising. Again, this loses the potential for Alice to actually deal with what matters.
Not sure what could be done about this. One thing is critique. E.g., instead of asking “what are some ways this proposed FAI design could go wrong?”, asking “what is the deepest, most general way this must go wrong?”, so that you can update away from that entire class of doomed ideas, and potentially have more interesting babble.