These all make sense. I think you left out the most important set of biases for thinking clearly about alignment (and AI strategy): motivated reasoning and confirmation bias. After studying biases for a few years, I became convinced that motivated reasoning is by far the biggest problem in the world, because it creates polarized beliefs; and that it’s often mislabeled as confirmation bias.
I didn’t try to get into the empirical literature, but it does seem like motivated reasoning has very large effect sizes, particularly when the real answer is difficult to evaluate.
Here’s a longer, but still loose, argument for motivated reasoning as the most important bias to be aware of.