theorems which state that, unless an agent can be represented as maximizing expected utility, that agent is liable to pursue strategies that are dominated by some other available strategy.
While I agree that such theorems would count as coherence theorems, I wouldn’t consider this to cover most things I think of as coherence theorems, and so I think it is simply a bad definition.
I think of coherence theorems loosely as things that say if an agent follows such and such principles, then we can prove it will have a certain property. The usefulness comes from both directions: to the extent the principles seem like good things to have, we’re justified in assuming the property, and to the extent that the property seems too strong, one of those principles will have to break.
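To make that “principles imply property” shape concrete, here is the standard VNM theorem sketched in that form (loose notation; $\succeq$ is the agent’s preference relation over lotteries and $u$ is the utility function the theorem delivers):

$$
\underbrace{\text{completeness} \,\wedge\, \text{transitivity} \,\wedge\, \text{continuity} \,\wedge\, \text{independence}}_{\text{principles on } \succeq}
\;\Longrightarrow\;
\exists\, u \;\text{such that}\; A \succeq B \iff \mathbb{E}_A[u] \ge \mathbb{E}_B[u].
$$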
I think of coherence theorems loosely as things that say if an agent follows such and such principles, then we can prove it will have a certain property.
If you use this definition, then VNM (etc.) counts as a coherence theorem. But Premise 1 of the coherence argument (as I’ve rendered it) remains false, and so you can’t use the coherence argument to get the conclusion that sufficiently-advanced artificial agents will be representable as maximizing expected utility.
I don’t think the majority of the papers that you cite make the argument that coherence arguments prove that any sufficiently-advanced AI will be representable as maximizing expected utility. Indeed, I am very confident that almost everyone you cite does not believe this, since it is a very strong claim. Many of the quotes you give even explicitly qualify it:
then you will make strictly worse choices by your own lights than if you followed some alternate EU-maximizing strategy (at least in some situations, though they may not arise)
The parenthetical here is important.
I don’t think any of the other quotes you cite really make the strong claim you are arguing against. Indeed, it is trivially easy to think of an extremely powerful AI that is VNM-rational in all situations except for one tiny thing that does not matter or will never come up. Technically its preferences can then not be represented by a utility function, but that’s not very relevant to the core arguments at hand, and I feel like your arguments are trying to tear down a strawman of an extreme position that I don’t think anyone holds.
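To sketch this minimally (the lotteries $X$ and $Y$ are hypothetical placeholders): suppose the agent satisfies all of the VNM axioms except that, for exactly one pair of lotteries it will never actually face,

$$
X \not\succeq Y \quad\text{and}\quad Y \not\succeq X.
$$

Completeness fails, so no utility function can represent the full preference relation, yet in every situation the agent actually encounters its behavior is that of an expected utility maximizer.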
Eliezer has also explicitly written that it is possible, though very difficult, to design superintelligences that reflectively and coherently believe in logical falsehoods. That alone would also violate VNM rationality.
You misunderstand me (and I apologize for that; I now think I should have made this clear in the post). I’m arguing against the following weak claim (also written out schematically at the end of this comment):
For any agent who cannot be represented as maximizing expected utility, there is at least some situation in which that agent will pursue a dominated strategy.
And my argument is:
There are no theorems which state or imply that claim. VNM doesn’t, Savage doesn’t, Bolker-Jeffrey doesn’t, Dutch Books don’t, Cox doesn’t, Complete Class doesn’t.
Money-pump arguments for the claim are not particularly convincing (for the reasons that I give in the post).
‘The relevant situations may not arise’ is a different objection. It’s not the one that I’m making.
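For explicitness, here is that weak claim written schematically (my notation, nothing more than a restatement):

$$
\forall \text{ agents } A:\; \neg\,\text{EU-representable}(A) \;\Longrightarrow\; \exists \text{ a situation in which } A \text{ pursues a dominated strategy}.
$$

None of the theorems listed above have a conclusion of this shape.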