My main take on what’s wrong with Bayesian epistemology is that, to the extent it’s useless in real life, it’s because it focuses way too much on the ideal case, à la @Robert Miles’s tweet here:
https://x.com/robertskmiles/status/1830925270066286950
(The other problem I have with it is that even in the ideal case, it doesn’t have a way to sensibly handle probability-0 events, or conditioning on probability-0 events, which can actually happen once we leave the world of finite sets and measures.)
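A minimal sketch of why conditioning on probability-0 events is a real problem outside finite settings, using the standard continuous-uniform example (the illustration is mine, not from the thread):

```latex
% Conditioning on a probability-0 event: the ratio definition breaks down.
% Take X uniform on [0,1]; every single point has probability zero.
\[
X \sim \mathrm{Uniform}([0,1]), \qquad P(X = x) = 0 \quad \text{for every } x \in [0,1].
\]
\[
P(A \mid X = x) = \frac{P(A \cap \{X = x\})}{P(\{X = x\})} = \frac{0}{0} \quad \text{(undefined).}
\]
% Measure theory patches this with regular conditional probabilities
% (disintegration), but the Borel--Kolmogorov paradox shows the answer can
% depend on which family of positive-probability events you take limits
% along, so "just apply Bayes" is not well-defined without extra structure.
```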
That said, I don’t think that people being wrong about epistemology is the cause of high p(Doom).
I’d agree more with @Algon in that the issues lie elsewhere (though a nitpick is that I wouldn’t say that EU maximization is wrong for TAI/AGI/ASI, but rather that certain dangerous properties don’t automatically hold, and that systems that EU-maximize in real life, like GPT-4, aren’t actually nearly as dangerous as often assumed. I agree with the other points.)
(I am not the inimitable @Robert Miles, though we do have some things in common.)
Reply to @Algon:
What I was talking about is that predictive models like GPT-4 have a utility function that’s essentially predictive: the maximization amounts to making the best update/prediction it can given input conditions.
These posts can help you understand more about predictive/simulator utility functions like GPT-4’s:
https://www.lesswrong.com/posts/vs49tuFuaMEd4iskA/one-path-to-coherence-conditionalization
https://www.lesswrong.com/posts/k48vB92mjE9Z28C3s/implied-utilities-of-simulators-are-broad-dense-and-shallow
https://www.lesswrong.com/posts/EBKJq2gkhvdMg5nTQ/instrumentality-makes-agents-agenty
The ideal predictor’s utility function is instead strictly over the model’s own outputs, conditional on inputs.
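As a rough sketch of the kind of predictive utility function being pointed at here (my own toy formalization, not something taken from the linked posts; the names and numbers are made up for illustration), the utility is a log-score over the model’s outputs conditional on its input, and maximizing it just means reporting the best prediction available:

```python
# Toy sketch of a "predictive" utility function: utility is defined strictly
# over the model's own outputs conditional on the input, never over world
# states, and is maximized by honest prediction (log-loss is a strictly
# proper scoring rule).
import math

def predictive_utility(predicted_dist, observed_token):
    """u(output | input): log-probability the model assigned to what happened."""
    return math.log(predicted_dist[observed_token])

def expected_utility(model_dist, true_dist):
    """Expected log-score of the model under the true conditional distribution."""
    return sum(p * predictive_utility(model_dist, tok) for tok, p in true_dist.items())

# Hypothetical next-token distributions after some prompt.
true_dist = {"mat": 0.7, "floor": 0.2, "dog": 0.1}
honest    = {"mat": 0.7, "floor": 0.2, "dog": 0.1}   # matches the data
skewed    = {"mat": 0.1, "floor": 0.1, "dog": 0.8}   # confidently wrong

print(expected_utility(honest, true_dist))  # ~ -0.80 (higher)
print(expected_utility(skewed, true_dist))  # ~ -2.09 (lower)
# Nothing in this objective references the external world directly, which is
# the sense in which the maximization is "just" predicting well given inputs.
```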
I’m doubtful that GPT-4 has a utility function. If it did, I would be kind of terrified. I don’t think I’ve seen the posts you linked to though, so I’ll go read those.
Maybe a crux is that I’m willing to grant learned utility functions as utility functions, and I tend to see EU maximization/utility-function reasoning in general as implying far fewer consequences than people on LW think it does, at least without more constraints.
It doesn’t try to assert its own existence, because that’s not necessary for maximizing updating/prediction output based on inputs.
I think the crux lies elsewhere, as I was sloppy in my wording. It’s not that maximizing some utility function is an issue, as basically anything can be viewed as EU maximization for a sufficiently wild utility function. However, I don’t view that as a meaningful utility function. Rather, it’s things like utility functions over states that I think are meaningful, and those are scary. That’s how I think you get classical paperclip maximizers.
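To spell out the degenerate construction behind “anything can be viewed as EU maximization” (standard argument, phrased by me rather than quoted from the thread):

```latex
% Given any policy \pi, define a utility over histories h and actions a that
% rewards exactly the actions \pi actually takes:
\[
u_\pi(h, a) =
\begin{cases}
1 & \text{if } a = \pi(h), \\
0 & \text{otherwise.}
\end{cases}
\]
% By construction, \pi maximizes expected u_\pi. Such a utility function does
% no predictive work, which is why it is not a "meaningful" one here; the
% scary conclusions need a restricted class, e.g. utilities over world states.
```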
When I try to think up a meaningful utility function for GPT-4, I can’t find anything that’s plausible, which means I don’t think there’s a meaningful prediction-utility function which describes GPT-4’s behaviour. Perhaps that is a crux.
Re utility functions over states: it turns out that we can validly turn utility functions over plans/predictions into utility functions over world states/outcomes (usually with constraints on how large the domain is, though not always).
https://www.lesswrong.com/posts/k48vB92mjE9Z28C3s/?commentId=QciMJ9ehR9xbTexcc
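One way such a translation can go, as a hedged sketch (my reconstruction of the general idea, not necessarily the construction used in the linked comment): suppose each plan/prediction p deterministically leads to an outcome ω(p), and ω is injective on the plans under consideration, which is where the domain-size constraint comes in.

```latex
% Define a state-utility V from the plan-utility U via the plan-to-outcome map:
\[
V(s) :=
\begin{cases}
U(p) & \text{if } s = \omega(p) \text{ for some plan } p, \\
\min_{p'} U(p') & \text{otherwise (unreachable states; any floor value works).}
\end{cases}
\]
% An agent picking the plan that maximizes U then ends up in exactly the state
% that maximizes V among reachable states, so the plan-level and state-level
% descriptions pick out the same behaviour.
```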
And yeah, I think it’s a crux that, at the very least, GPT-N systems, if they reach AGI/ASI, will probably still look like maximizers for updating given input conditions like prompts.
My main point isn’t that the utility-function framing of GPT-4 or GPT-N is wrong, but rather that LWers inferred way too much about how a system would behave, even conditional on expected utility maximization being a coherent frame for AIs, because EU maximization doesn’t logically imply the properties they thought it did without more assumptions that need to be defended.