What is the extra assumption? If you’re making a coherence argument, that already specifies the domain of coherence, no? And so I’m not making any more assumptions than the original coherence argument did (whatever that argument was). I agree that the original coherence argument can fail, though.
I think we’re just debating semantics of the word “assumption”.
Consider the argument:
“A superintelligent AI will be VNM-rational, and therefore it will pursue convergent instrumental subgoals.”
I think we both agree this is not a valid argument, or at least that it is missing some details about what the AI is VNM-rational over before it becomes one. That’s all I’m trying to say.
Unimportant aside on terminology: I think in colloquial English it is reasonable to say that this is “missing an assumption”. I assume that you want to think of this as math. My best guess at how to turn the argument above into math would be something that looks like:
? ⟹ VNM-rational over state-based outcomes
VNM-rational over state-based outcomes ⟹ Convergent instrumental subgoals
This still seems like a “missing assumption”, since whatever fills the ? is itself an “assumption”.
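A minimal sketch of that chain in logical notation (my own rendering, not anything from the original argument), writing P for whatever fills the ?, V for “VNM-rational over state-based outcomes”, and C for “pursues convergent instrumental subgoals”:

\[
% C only follows from the implication chain once P itself is granted as a premise:
P,\quad P \implies V,\quad V \implies C \;\;\vdash\;\; C
\]

That is, the conclusion needs P supplied as an extra premise, which is why I keep calling it a missing assumption.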
Maybe you’re like “Well, if you start with the setup of an agent that satisfies the VNM axioms over state-based outcomes, then you really do just need VNM to conclude ‘convergent instrumental subgoals’, so there’s no extra assumptions needed”. I just don’t start with such a setup; I’m always looking for arguments with the conclusion “in the real world, we have a non-trivial chance of building an agent that causes an existential catastrophe”. (Maybe readers don’t have the same inclination? That would surprise me, but is possible.)
That’s an assumption :P (And it’s also not one that’s obviously true, at least according to me.)