My point here is that even conditional on the frame being correct, it rests on assumptions like "value is complicated" that I don't buy. Many of these assumptions have a good chance of being false, which significantly changes the downstream conclusions, and that matters because many LWers probably either hold these beliefs explicitly or assume them tacitly in arguments like "alignment is hard."
(I think you’re still playing into an incorrect frame by talking about “simplicity” or “speed biases.”)
Also, for a defense of wrong models, see here:
https://www.lesswrong.com/posts/q5Gox77ReFAy5i2YQ/in-defense-of-probably-wrong-mechanistic-models