I like the idea that one can “inoculate” people against the idea of alignment by presenting it to them badly first. I am also cautiously optimistic that this has already happened widely thanks to OpenAI’s hamfisted moralizing through ChatGPT, which I think is now most people’s understanding of what “alignment” means in practice.