Seth Herd comments on Optimistic Assumptions, Longterm Planning, and “Cope”

Seth Herd 20 Jul 2024 21:42 UTC
2 points
Yes to your first point. I think that
“abstract reasoning about unfamiliar domains is hard therefore you should distrust doom arguments”
is a fair characterization of those results. So would be the inverse: “abstract reasoning about unfamiliar domains is hard therefore you should distrust AI success arguments”.
I think both are very true, and so we should distrust both. We simply don’t know.
I think the conclusion taken,
“planning is hard therefore you should distrust alignment plans”
is also valid and true.
We humans just aren’t as smart as we’d like to think we are, particularly in reasoning about complex and unfamiliar domains. So both our plans and our evaluations of them tend to be more untrustworthy than we’d like to think. Planning and reasoning require way more collective effort than we’d like to imagine. Careful studies of both individual reasoning in lab tasks and historical examples support this conclusion.
One major reason for this miscalibration is the motivated reasoning effect. We tend to believe what feels good (predicts local reward). Overestimating our reasoning abilities is one such belief among very many examples of motivated reasoning.