I wrote this post imagining the “strategy-stealing assumption” as something you would assume for the purpose of an argument; for example, I might want to justify an AI alignment scheme by arguing, “Under a strategy-stealing assumption, this AI would result in an OK outcome.”
When you say “strategy-stealing assumption” in this sentence, do you mean the relatively narrow assumption that you gave in this post, specifically about “flexible influence”:
> This argument rests on what I’ll call the strategy-stealing assumption: for any strategy an unaligned AI could use to influence the long-run future, there is an analogous strategy that a similarly-sized group of humans can use in order to capture a similar amount of flexible influence over the future.
or a stronger assumption that additionally holds that the universe and our values are such that “capture a similar amount of flexible influence over the future” would lead to an OK outcome? I’m guessing the latter? I feel like people, including me at times and you in this instance, keep equivocating between these two meanings when using “strategy-stealing assumption”. Maybe we should have two distinct terms for these two concepts?