These are interesting anecdotes but it feels like they could just as easily be used to argue for the opposite conclusion.
That is, your frame here is something like “planning is hard therefore you should distrust alignment plans”.
But you could just as easily frame this as “abstract reasoning about unfamiliar domains is hard therefore you should distrust doom arguments”.
That doesn’t sound right to me.
The reported observation is not just that these particular people failed at a planning / reasoning task. The reported observation is that they repeatedly made optimistic, miscalibrated assumptions, because those assumptions supported a plan.
There’s a more specific reasoning error being posited, beyond “people are often wrong when trying to reason about abstract domains without feedback”. Something like “people will anchor on ideas if those ideas are necessary for the success of a plan and they don’t see an alternative plan.”
If that posit is correct, the right update isn’t just “reasoning abstractly is hard, so we should widen our confidence intervals / be more uncertain”. We should update to having a much higher evidential bar for the efficacy of plans.