This seems like a trait which AGIs might have, but not a part of how they should be defined.
There’s a thing that Eric is arguing against in his report, which he calls an “AGI agent”. I think it is reasonable to say that this thing can be fuzzily defined as something that approximates an expected utility maximizer.
(By your definition of AGI, which seems to be something like “thing that can do all tasks that humans can do”, CAIS would be AGI, and Eric is typically contrasting CAIS and AGI.)
That said, I disagree with Wei that this is relatively crisp: taken literally, the definition is vacuous because all behavior maximizes some expected utility. Maybe we mean that it is long-term goal-directed, but at least I don’t know how to cash that out. I think I agree that it is more crisp than the notion of a “service”, but it doesn’t feel that much more crisp.
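The vacuousness point can be made concrete with a toy sketch (my construction, not from the thread): for any fixed behavior whatsoever, we can write down after the fact a utility function that the behavior maximizes, so "approximates an expected utility maximizer" by itself rules nothing out.

```python
# Toy illustration: any policy trivially "maximizes expected utility"
# if we get to choose the utility function after seeing the policy.

def rationalizing_utility(policy):
    """Return a utility function under which `policy` is optimal:
    utility 1 for doing exactly what the policy does, 0 otherwise."""
    def utility(state, action):
        return 1.0 if action == policy(state) else 0.0
    return utility

# Pick any policy at all -- here, one that always does nothing:
policy = lambda state: "do_nothing"
u = rationalizing_utility(policy)

# The policy is optimal under u in every state, so the definition,
# taken literally, places no constraint on behavior.
assert all(u(s, policy(s)) >= u(s, a)
           for s in ["s1", "s2"]
           for a in ["do_nothing", "act"])
```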
That said, I disagree with Wei that this is relatively crisp: taken literally, the definition is vacuous because all behavior maximizes some expected utility.
I think I meant more that an AGI’s internal cognition resembles that of an expected utility maximizer. But even that isn’t quite right since it would cover AIs that only care about abstract worlds or short time horizons or don’t have general intelligence. So yeah, I definitely oversimplified there.
Maybe we mean that it is long-term goal-directed, but at least I don’t know how to cash that out.
What’s wrong with cashing that out as trying to direct/optimize the future according to some (maybe partial) preference ordering (and using a wide range of competencies, to cover “general”)? You said “In fact, I don’t want to assume that the agent even has a preference ordering” but I’m not sure why. Can you expand on that?
You said “In fact, I don’t want to assume that the agent even has a preference ordering” but I’m not sure why.
You could model a calculator as having a preference ordering, but that seems like a pretty useless model. Similarly, if you look at current policies that we get from RL, it seems like a relatively bad model to say that they have a preference ordering, especially a long-term one. It seems more accurate to say that they are executing a particular learned behavior that can’t be easily updated in the face of changing circumstances.
On the other hand, the (training process + resulting policy) together is more reasonably modeled as having a preference ordering.
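A toy bandit sketch (my construction, purely illustrative) of this distinction: the frozen policy keeps executing its learned behavior when circumstances change, while the combined (training process + policy) system adapts toward whatever is rewarded, which is what makes the latter more reasonably modeled as having a preference ordering.

```python
import random

random.seed(0)  # deterministic illustration

def train(reward, actions, steps=2000, lr=0.1):
    """Trivial bandit learner: tracks a value estimate per action,
    then returns a frozen greedy policy."""
    q = {a: 0.0 for a in actions}
    for _ in range(steps):
        a = random.choice(actions)        # explore uniformly
        q[a] += lr * (reward(a) - q[a])   # update value estimate
    return lambda: max(q, key=q.get)      # frozen: no further learning

actions = ["left", "right"]
policy = train(lambda a: 1.0 if a == "left" else 0.0, actions)
assert policy() == "left"  # the learned behavior

# Circumstances change: "right" is now rewarded. The frozen policy
# doesn't notice, but rerunning the training process does.
new_reward = lambda a: 1.0 if a == "right" else 0.0
assert policy() == "left"                       # policy alone: unchanged
assert train(new_reward, actions)() == "right"  # training + policy: adapts
```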
While it’s true that so far the only model we have for getting generally intelligent behavior is to have a preference ordering (perhaps expressed as a reward function) that is then optimized, it doesn’t seem clear to me that any AI system we build must have this property. For example, GOFAI approaches do not seem like they are well-modeled as having a preference ordering, similarly with theorem proving.
(GOFAI and theorem proving are also examples of technologies that could plausibly have led to what-I-call-AGI-which-is-not-what-Eric-calls-an-AGI-agent, but whose internal cognition does not resemble that of an expected utility maximizer.)
Responding to this very late, but: If I recall correctly, Eric has told me in personal conversation that CAIS is a form of AGI, just not agent-like AGI. I suspect Eric would agree broadly with Richard’s definition.
I agree that the set of services is intended to, in aggregate, perform any task (that’s what the “Comprehensive” part of “Comprehensive AI Services” means), and it shares that property with AGI (that’s what the “General” part of “Artificial General Intelligence” means).
There are other properties that Bostrom/Yudkowsky conceptions of AGI have that CAIS doesn’t have, including “searching across long-term plans to find one that achieves a potentially-unbounded goal, which involves deceiving or overpowering humans if they would otherwise try to interfere”.
I don’t particularly care what terminology we use; I just want us to note which properties a given system or set of systems does and does not have.