This seems somewhat connected to the previous argument. Coherent agents can be modeled as utility-optimizers, yes, but what this really proves is that almost any behavior fits the model “utility-optimizer”, not that coherent agents must look like our intuitive picture of a utility-optimizer.
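To make that first point concrete, here is a minimal sketch (in a toy deterministic, finite-horizon setting, with all names hypothetical) of the standard construction: take any policy at all, even one that just twitches, and define a utility function that assigns 1 to the exact trajectory the policy produces and 0 to everything else. The policy then maximizes that utility function, so “is a utility-optimizer” by itself constrains behavior almost not at all.

```python
from itertools import product

ACTIONS = ["left", "right", "twitch"]
HORIZON = 3

def arbitrary_policy(history):
    # Any behavior whatsoever -- here, an agent that just cycles
    # through its actions regardless of what has happened.
    return ACTIONS[len(history) % len(ACTIONS)]

def policy_trajectory():
    # Roll the policy forward to get the one trajectory it produces.
    history = []
    for _ in range(HORIZON):
        history.append(arbitrary_policy(history))
    return tuple(history)

def utility(trajectory):
    # Assign utility 1 to exactly the trajectory the policy produces,
    # and 0 to every other trajectory.
    return 1.0 if trajectory == policy_trajectory() else 0.0

# Check: the policy's trajectory attains the maximum utility over
# all possible action sequences of length HORIZON.
best = max(product(ACTIONS, repeat=HORIZON), key=utility)
assert utility(policy_trajectory()) == utility(best) == 1.0
print("The arbitrary policy maximizes this utility function.")
```

The construction is trivial by design: it shows that “behaves as if maximizing some utility function” is a very weak property, which is exactly why the coherence theorems alone don’t get you to the intuitive picture of a goal-directed optimizer.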
Paraphrasing Rohin’s arguments somewhat: the arguments for instrumental convergence say something like “for ‘most’ ‘natural’ utility functions, optimizing that function means acquiring power, killing off adversaries, acquiring resources, and so on”. We know that all coherent behavior comes from a utility function, but it doesn’t follow that most coherent behavior exhibits this sort of power-seeking.