AGI is … something that approximates an expected utility maximizer.
This seems like a trait which AGIs might have, but not a part of how they should be defined. I think Eric would say that the first AI system which can carry out all the tasks we would expect an AGI to be capable of won’t actually approximate an expected utility maximiser, and I consider it an open empirical question whether or not he’s right.
Many risk-reducing services (especially ones that can address human safety problems) seem to require high-level general reasoning abilities, whereas many risk-increasing services can just be technical problem solvers or other kinds of narrow intelligences or optimizers. So CAIS is actually quite unsafe and hard to make safe, whereas AGI / goal-directed agents are highly unsafe by default, but can perhaps be made safe with appropriate advances in safety research.
Yeah, good point. I guess that my last couple of sentences were pretty shallowly-analysed, and I’ll retract them and add a more measured conclusion.
This seems like a trait which AGIs might have, but not a part of how they should be defined.
There’s a thing that Eric is arguing against in his report, which he calls an “AGI agent”. I think it is reasonable to say that this thing can be fuzzily defined as something that approximates an expected utility maximizer.
(By your definition of AGI, which seems to be something like “thing that can do all tasks that humans can do”, CAIS would be AGI, and Eric is typically contrasting CAIS and AGI.)
That said, I disagree with Wei that this is relatively crisp: taken literally, the definition is vacuous because all behavior maximizes some expected utility. Maybe we mean that it is long-term goal-directed, but at least I don’t know how to cash that out. I think I agree that it is more crisp than the notion of a “service”, but it doesn’t feel that much more crisp.
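To spell out why the literal reading is vacuous, here is the standard construction, written as a sketch (the notation $U_\pi$, $\tau$ is just for illustration, not from the report): for any policy whatsoever, you can define a utility function that it maximizes in expectation.

```latex
% Sketch: any policy \pi maximizes *some* expected utility.
% Let \tau range over trajectories (histories of observations and actions),
% and define
\[
U_\pi(\tau) =
\begin{cases}
1 & \text{if every action in } \tau \text{ is one that } \pi \text{ could take at that history,}\\
0 & \text{otherwise.}
\end{cases}
\]
% Trajectories produced by running \pi satisfy this condition with
% probability 1, so in any environment
\[
\mathbb{E}_{\tau \sim \pi}\bigl[U_\pi(\tau)\bigr] = 1
= \max_{\pi'} \mathbb{E}_{\tau \sim \pi'}\bigl[U_\pi(\tau)\bigr],
\]
% i.e. \pi is an expected utility maximizer for U_\pi. Since this works
% for every policy, "maximizes some expected utility" rules nothing out
% on its own.
```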
That said, I disagree with Wei that this is relatively crisp: taken literally, the definition is vacuous because all behavior maximizes some expected utility.
I think I meant more that an AGI’s internal cognition resembles that of an expected utility maximizer. But even that isn’t quite right since it would cover AIs that only care about abstract worlds or short time horizons or don’t have general intelligence. So yeah, I definitely oversimplified there.
Maybe we mean that it is long-term goal-directed, but at least I don’t know how to cash that out.
What’s wrong with cashing that out as trying to direct/optimize the future according to some (maybe partial) preference ordering (and using a wide range of competencies, to cover “general”)? You said “In fact, I don’t want to assume that the agent even has a preference ordering” but I’m not sure why. Can you expand on that?
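For concreteness, here is one hypothetical way to write that down (the $\succeq$ / $P_a$ notation is my own sketch, not something from the thread or the report):

```latex
% Hypothetical formalization of "directing the future according to a
% (possibly partial) preference ordering". Let \succeq be a partial order
% over distributions of long-run outcomes, and let P_a be the outcome
% distribution induced by the agent taking action a. Call the agent
% goal-directed with respect to \succeq if it never takes a dominated action:
\[
\text{the agent takes } a \;\implies\; \neg\,\exists\, a' \text{ such that } P_{a'} \succ P_a .
\]
% "General" would then mean that this holds across a wide range of tasks
% and environments rather than in one narrow domain.
```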
You said “In fact, I don’t want to assume that the agent even has a preference ordering” but I’m not sure why.
You could model a calculator as having a preference ordering, but that seems like a pretty useless model. Similarly, if you look at current policies that we get from RL, it seems like a relatively bad model to say that they have a preference ordering, especially a long-term one. It seems more accurate to say that they are executing a particular learned behavior that can’t be easily updated in the face of changing circumstances.
On the other hand, the (training process + resulting policy) together is more reasonably modeled as having a preference ordering.
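Here is a toy illustration of that distinction (hypothetical code; the bandit setup and names like `retrain` are made up for this example, not taken from any real system):

```python
# Toy two-armed bandit whose payoffs can change over time.
def pull(arm, payoffs):
    return payoffs[arm]

# A frozen "policy": a fixed choice of arm. When the world changes it keeps
# doing the same thing, so ascribing it a preference for high reward adds
# nothing beyond describing its fixed behavior.
frozen_arm = 0

# The (training process + resulting policy) system: it re-estimates each
# arm's payoff and acts greedily. Modeling *this* as preferring higher
# reward does predict how it behaves when the environment changes.
def retrain(payoffs, samples=10):
    estimates = [
        sum(pull(arm, payoffs) for _ in range(samples)) / samples
        for arm in (0, 1)
    ]
    return max((0, 1), key=lambda arm: estimates[arm])

payoffs = [1.0, 0.0]                      # arm 0 pays better at first
learned_arm = retrain(payoffs)            # -> 0

payoffs = [0.0, 1.0]                      # the world changes
print("frozen policy still plays arm", frozen_arm)          # 0
print("retrained system now plays arm", retrain(payoffs))   # 1
```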
While it’s true that so far the only model we have for getting generally intelligent behavior is to have a preference ordering (perhaps expressed as a reward function) that is then optimized, it doesn’t seem clear to me that any AI system we build must have this property. For example, GOFAI approaches do not seem like they are well-modeled as having a preference ordering, similarly with theorem proving.
(GOFAI and theorem proving are also examples of technologies that could plausibly have led to what-I-call-AGI-which-is-not-what-Eric-calls-an-AGI-agent, but whose internal cognition does not resemble that of an expected utility maximizer.)
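To make the GOFAI example concrete, here is a toy forward-chaining rule engine (entirely hypothetical code, not anything from Eric's report): its behavior comes from exhaustively applying inference rules, and there is no reward function or preference ordering anywhere in the system.

```python
# Toy GOFAI-style forward chaining: derive everything derivable from a set
# of facts and rules. Behavior arises from rule application, not from
# optimizing any preference ordering or utility function.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [
    (["human(socrates)"], "mortal(socrates)"),
    (["mortal(socrates)"], "will_die(socrates)"),
]
print(forward_chain(["human(socrates)"], rules))
# derives mortal(socrates) and will_die(socrates) in addition to the input fact
```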
Responding to this very late, but: If I recall correctly, Eric has told me in personal conversation that CAIS is a form of AGI, just not agent-like AGI. I suspect Eric would agree broadly with Richard’s definition.
I agree that the set of services is intended to, in aggregate, perform any task (that’s what the “Comprehensive” part of “Comprehensive AI Services” means), and it shares that property with AGI (that’s what the “General” part of “Artificial General Intelligence” means).
There are other properties that Bostrom/Yudkowsky conceptions of AGI have that CAIS doesn’t have, including “searching across long-term plans to find one that achieves a potentially-unbounded goal, which involves deceiving or overpowering humans if they would otherwise try to interfere”.
I don’t particularly care what terminology we use; I just want us to note which properties a given system or set of systems does and does not have.