Maybe a crux is that I’m willing to grant learned utility functions as utility functions, and I tend to see EU maximization/utility function reasoning in general as implying far fewer consequences than people on LW think it does, at least without more constraints.
It doesn’t try to assert its own existence, because that’s not necessary for maximizing its updating/prediction objective on the inputs it’s given.
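To put the prediction objective concretely (a rough gloss of standard language-model pretraining, not a claim about the exact training setup): the model is trained to maximize next-token log-likelihood,

$$\max_\theta \; \mathbb{E}_{x \sim \mathcal{D}}\!\left[\sum_t \log p_\theta\!\left(x_t \mid x_{<t}\right)\right],$$

which is a “utility” over predictions given inputs, with no term that references the model’s own continued existence.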
I think the crux lies elsewhere, as I was sloppy in my wording. It’s not that maximizing some utility function is an issue; basically anything can be viewed as EU maximization for a sufficiently wild utility function, but I don’t view that as a meaningful utility function. Rather, it is utility functions over states, for example, that I consider meaningful, and those are the scary ones. That’s how I think you get classical paperclip maximizers.
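To spell out the “sufficiently wild utility function” point with the standard construction: for any observed policy $\pi^*$ mapping histories to actions, define

$$u(h, a) = \begin{cases} 1 & \text{if } a = \pi^*(h), \\ 0 & \text{otherwise.} \end{cases}$$

Any behaviour whatsoever maximizes expected utility under its own $u$, so “it’s an EU maximizer” constrains nothing by itself; that’s why I don’t count such utility functions as meaningful.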
When I try to think up a meaningful utility function for GPT-4, I can’t find anything plausible, which means I don’t think there’s a meaningful prediction-utility function that describes GPT-4’s behaviour. Perhaps that is a crux.
Re utility functions over states: it turns out that we can validly turn utility functions over plans/predictions into utility functions over world states/outcomes (usually, though not always, with constraints on how large the domain can be).
https://www.lesswrong.com/posts/k48vB92mjE9Z28C3s/?commentId=QciMJ9ehR9xbTexcc
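As a minimal sketch of the simplest (deterministic) case of that translation, my gloss rather than the linked comment’s actual argument: if each plan $p$ leads to a unique outcome $o(p)$, define

$$V(s) \;=\; \max_{p \,:\, o(p) = s} U(p),$$

and a plan that maximizes the plan-utility $U$ also leads to an outcome that maximizes the state-utility $V$; stochastic outcomes and large domains are where the caveats above come in.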
And yeah, I think it’s a crux that, at the very least, GPT-N systems, if they reach AGI/ASI, will probably look like maximizers for updating/prediction given input conditions like prompts.
My main point isn’t that the utility function framing of GPT-4 or GPT-N is wrong, but rather that LWers inferred way too much about how such a system would behave: even conditional on expected utility maximization being a coherent frame for AIs, it doesn’t logically imply the properties they thought it did without more assumptions that need to be defended.