I have doubts that the goals of a superintelligence are predictable by us.
Do you mean intrinsic (top-level, static) goals, or instrumental ones (subgoals)? Bostrom in this chapter is concerned with the former, and there’s no particular reason those have to get complicated. You could certainly have a human-level intelligence that only inherently cared about eating food and having sex, though humans are not that kind of being.
Instrumental goals are indeed likely to get more complicated as agents become more intelligent and can devise more involved schemes to achieve their intrinsic values, but you also don’t really need to understand them in detail to make useful predictions about the consequences of an intelligence’s behavior.
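To make that concrete, here is a toy sketch of the point (entirely my own illustration, not something from Bostrom; the apple-counting world and every name in it are made up). The agent's "intelligence" is modelled crudely as lookahead depth, and its terminal goal is a one-line scoring function; cranking up the depth produces more elaborate plans without the goal itself getting any more complicated.

from itertools import product

# The terminal goal: a deliberately trivial scoring function over world states.
def simple_goal(state):
    return state["apples"]

# A toy world: each action changes the apple count in a different way.
def transition(state, action):
    deltas = {"pick": 1, "trade": 3, "wait": 0}
    return {"apples": state["apples"] + deltas[action]}

# "Intelligence" here is just exhaustive lookahead: a smarter agent searches
# longer action sequences for whichever one scores best under its goal.
def plan(state, actions, goal, depth):
    best_score, best_seq = goal(state), ()
    for seq in product(actions, repeat=depth):
        s = state
        for a in seq:
            s = transition(s, a)
        if goal(s) > best_score:
            best_score, best_seq = goal(s), seq
    return best_seq, best_score

start = {"apples": 0}
for depth in (1, 2, 4):  # increasing "intelligence"
    seq, score = plan(start, ["pick", "trade", "wait"], simple_goal, depth)
    print(depth, seq, score)

Running it at depths 1, 2 and 4 just yields longer and better plans; simple_goal stays exactly as simple as it was.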
Do you mean intrinsic (top-level, static) goals, or instrumental ones (subgoals)? Bostrom in this chapter is concerned with the former, and there’s no particular reason those have to get complicated.
I mean terminal, top-level (though not necessarily static) goals.
As to “no reason to get complicated”, how would you know? Note that I’m talking about a superintelligence, which is far beyond human level.
As to “no reason to get complicated”, how would you know?
It’s a direct consequence of the orthogonality thesis. Bostrom (reasonably enough) supposes that there might be a limit in one direction: to hold a goal you do need to be able to model it to some degree, so an agent’s intelligence may set an upper bound on the complexity of the goals it can hold. But there’s no corresponding reason for a limit in the other direction: intelligent agents can understand simple goals just fine. I don’t have a problem reasoning about what a cow is trying to do, and I could certainly optimize toward the same ends had my mind been constructed to want only those things.
I don’t understand your reply.
How would you know that there’s no reason for terminal goals of a superintelligence “to get complicated” if humans, being “simple agents” in this context, are not sufficiently intelligent to consider highly complex goals?