I read Yudkowsky as positing some kind of conservation law. Something like: if the plans produced by your AI, when implemented, succeed at bringing about specifically chosen far-reaching consequences, then the AI must have done reasoning about far-reaching consequences.
Why this seems true:
Any planning process which robustly succeeds must behave differently in the presence of different latent problems.
If I’m going to the store and one of two routes may be closed down, and I want to always arrive at the store, my plan must somehow behave differently across the two possible latent complications (which road is closed; see the toy sketch after the argument).
A pivotal act requires a complicated plan with lots of possible latent problems.
Any implementing process (like an AI) which robustly enacts a complicated plan (like destroying unaligned AGIs) must somehow behave differently in the presence of many different problems (like the designers trying to shut down the AI).
Thus, robustly pulling off a pivotal act requires some kind of “reasoning about far-reaching consequences” on the latent world state.
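As a toy sketch of the route example (my own illustration, not anything from the exchange; the world labels and function names are hypothetical): any fixed, unconditional route fails in one of the two possible worlds, while a plan that conditions on the latent fact of which road is closed succeeds in both.

```python
# Toy illustration: a plan that always reaches the store must branch on the
# latent state (which road is closed); no single unconditional route works
# in both possible worlds.

LATENT_WORLDS = ["route_A_closed", "route_B_closed"]

def unconditional_plan(_world):
    # Ignores the latent state entirely: always take route A.
    return "route_A"

def contingent_plan(world):
    # Behaves differently depending on the latent complication.
    return "route_B" if world == "route_A_closed" else "route_A"

def reaches_store(route, world):
    closed = "route_A" if world == "route_A_closed" else "route_B"
    return route != closed

for plan in (unconditional_plan, contingent_plan):
    robust = all(reaches_store(plan(w), w) for w in LATENT_WORLDS)
    print(plan.__name__, "robustly succeeds:", robust)
# unconditional_plan robustly succeeds: False
# contingent_plan robustly succeeds: True
```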
Yep, I agree with that. That’s orthogonal to myopia as I use the term, though.