Re utility functions over states, it turns out that we can validly turn utility functions over plans/predictions into utility functions over world states/outcomes (though usually with constraints on how large the domain is, though not always.)
And yeah, I think it’s a crux that I think that at the very least, what GPT-N systems will look like, if they reach AGI/ASI, will probably look like a maximizer for updating given input conditions like prompts.
My main point isn’t that the utility function framing of GPT-4 or GPT-N is wrong, but rather that LWers inferred way too much from how a system would behave, even conditional on expected utility maximization being a coherent frame for AIs, because they don’t logically imply the properties they thought it did without more assumptions that need to be defended.
Re utility functions over states, it turns out that we can validly turn utility functions over plans/predictions into utility functions over world states/outcomes (though usually with constraints on how large the domain is, though not always.)
https://www.lesswrong.com/posts/k48vB92mjE9Z28C3s/?commentId=QciMJ9ehR9xbTexcc
And yeah, I think it’s a crux that I think that at the very least, what GPT-N systems will look like, if they reach AGI/ASI, will probably look like a maximizer for updating given input conditions like prompts.
My main point isn’t that the utility function framing of GPT-4 or GPT-N is wrong, but rather that LWers inferred way too much from how a system would behave, even conditional on expected utility maximization being a coherent frame for AIs, because they don’t logically imply the properties they thought it did without more assumptions that need to be defended.