I feel like this discussion could do with some disambiguation of what “VNM rationality” means.
VNM assumes consequentialism: preferences are defined over outcomes. If you define consequentialism narrowly, this has specific implications for instrumental convergence.
You can redefine what constitutes a consequence arbitrarily. But, along the lines of what Steven Byrnes points out in his comment, redefining this can get rid of instrumental convergence. In the extreme case you can define a utility function for literally any pattern of behaviour.
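The "utility function for literally any pattern of behaviour" point can be made concrete with a toy sketch (my illustration, not anything from the original discussion): given an arbitrary fixed policy, define "utility" to be 1 exactly when the agent does what that policy says, and 0 otherwise. The resulting function is maximized by that behaviour trivially, by construction.

```python
def make_trivial_utility(policy):
    """Given any policy (state -> action), return a 'utility function'
    over (state, action) pairs that the policy maximizes exactly."""
    def utility(state, action):
        # Utility 1 iff the action matches what the policy would do;
        # the 'consequence' here is just "did you follow the policy".
        return 1.0 if action == policy(state) else 0.0
    return utility

# Example: a policy that simply echoes the state back as its action.
echo = lambda s: s
u = make_trivial_utility(echo)

# The policy is "optimal" under u at every state, by construction,
# while any deviation scores strictly lower.
assert all(u(s, echo(s)) == 1.0 for s in range(5))
assert u(3, 4) == 0.0
```

Nothing about this construction constrains the behaviour in any way, which is exactly why calling a behaviour "utility-maximizing" is vacuous until the definition of consequences is pinned down.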
When you say you feel like you can’t be Dutch booked, you are at least implicitly assuming some definition of consequences you can’t be Dutch booked in terms of. To claim that one is rationally required to adopt any particular definition of consequences in one’s utility function is basically circular, since you only care about being Dutch booked according to it if you actually care about that definition of consequences. It’s in this sense that the VNM theorem is trivial.
BTW I am concerned that self-modifying AIs may self-modify towards VNM-0 agents.
But the reason is not that such self-modification is “rational”.
It’s just that (narrowly defined) consequentialist agents care about preserving and improving their abilities and proclivities to pursue their consequentialist goals, so tendencies towards VNM-0 will be reinforced in a feedback loop. Likewise for inter-agent competition.