Oh I see where you’re coming from now. I’ll admit that, when I made my earlier post, I forgot about the full implications of instrumental convergence. Specifically, the part where:
Maximizing X minimizes all Not X insofar as they both compete for the same resource pool.
Even if your resources are unusually low relative to where you’re positioned in the universe, an AI will still take them away from you. Optimizing one utility function doesn’t just randomly affect the optimization of other utility functions; they are anti-correlated in general.
I really gotta re-read Goodhart’s Taxonomy for a fourth time...
Well, they’re anti-correlated across different agents. But from the same agent’s perspective, they may still be able to maximize their own red-seeing, or even human red-seeing—they just won’t. (This will be in the next part of my sequence on impact).
Well, they’re anti-correlated across different agents. But from the same agent’s perspective, they may still be able to maximize their own red-seeing, or even human red-seeing—they just won’t
Just making sure I can parse this… When I say that they’re anti-correlated, I mean that the policy of maximizing X is akin to the policy of minimizing not X, to the extent that X and not X will at some point compete for the same instrumental resources. I will agree that an agent maximizing X who possesses many instrumental resources can use them to accomplish not X (and, in this sense, the agent doesn’t perceive X and not X as anti-correlated); and I’ll also agree that an agent optimizing X and another optimizing not X will compete for instrumental resources and view those goals as anti-correlated.
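Concretely, here is a minimal toy sketch of that shared-pool picture, assuming a fixed budget that gets split between X and not X (the pool size and the achieved helper are invented purely for illustration, not anything from the discussion):

```python
# Toy model of the shared-pool claim above. Purely illustrative: the pool size
# and the zero-sum split between X and not-X are assumptions of this sketch.
R = 100  # total instrumental resources

def achieved(share_for_x):
    """Return (level of X, level of not-X) given how much of the pool goes to X."""
    return share_for_x, R - share_for_x

# One agent maximizing X: it *could* route the whole pool toward not-X instead...
could_reach_not_x = achieved(0)[1]   # 100, so the capability is there
# ...but its policy never will, so X gets everything and not-X hits its floor.
x_level, not_x_level = achieved(R)   # (100, 0)

# Two agents, one optimizing X and the other optimizing not-X, drawing on the
# same pool: whatever one captures, the other loses.
for x_share in (0, 25, 50, 75, 100):
    x_level, not_x_level = achieved(x_share)
    assert x_level + not_x_level == R  # zero-sum, hence anti-correlated
```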
they may still be able to maximize their own red-seeing, or even human red-seeing—they just won’t
Some of this is a matter of semantics, but I think I agree with this. There are also two different definitions of the word “able” here:
Able #1: The extent to which it is possible for an agent to achieve X across all possible universes we think we might reside in
Able #2: The extent to which it is possible for an agent to achieve X in a counterfactual where the agent has a goal of achieving X
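To make the two readings concrete, here is a toy sketch of one way to cash out the difference, assuming a deterministic setup where a universe plus a goal fixes what the agent does (the worlds, the achieves_x dynamics, and the goals below are invented purely for illustration):

```python
# Hypothetical sketch of the two readings of "able". Toy assumption: a universe
# plus a goal fully determines whether the agent ends up achieving X.
worlds = ["w1", "w2"]   # universes we think we might reside in
actual_goal = "not X"   # the goal the agent in fact has

def achieves_x(world, goal):
    # Toy dynamics, identical in every world: the agent brings about X
    # only if X is actually its goal.
    return goal == "X"

# Able #1: given the agent's actual goal, does it achieve X in any universe
# we might be in?  (On the deterministic reading: no.)
able_1 = any(achieves_x(w, actual_goal) for w in worlds)   # False

# Able #2: in the counterfactual where X is its goal, does it achieve X?
able_2 = any(achieves_x(w, "X") for w in worlds)           # True

# This mirrors "they may still be able to maximize human red-seeing; they just
# won't": Able #2 holds even though Able #1 fails.
```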
I think you’re using Able #2 (which makes sense—it’s how the word is used colloquially). I tend to use Able #1 (because I read a lot about determinism when I was younger). I might be wrong about this, though, because you made a similar distinction between physical capability and anticipated possibility in Gears of Impact:
People have a natural sense of what they “could” do. If you’re sad, it still feels like you “could” do a ton of work anyways. It doesn’t feel physically impossible.
...
Imagine suddenly becoming not-sad. Now you “could” work when you’re sad, and you “could” work when you’re not-sad, so if AU just compared the things you “could” do, you wouldn’t feel impact here.
I think you’re using Able #2 (which makes sense—it’s how the word is used colloquially). I tend to use Able #1 (because I read a lot about determinism when I was younger). I might be wrong about this, though, because you made a similar distinction between physical capability and anticipated possibility in Gears of Impact:
I am using #2, but I’m aware that there’s a separate #1 meaning (and thank you for distinguishing between them so clearly, here!).