However, I worry that the fuzziness of the usual concept of AGI has now been replaced by a fuzzy notion of “service” which makes sense in our current context, but may not in the context of much more powerful AI technology.
It seems to me that “AGI” is actually relatively crisp compared to “service”: it’s something that approximates an expected utility maximizer, which seems like a pretty small and relatively compact cluster in thing-space. “Service” seems to cover a lot more varied ground, from early simple things like image classifiers to later strategic planners, natural language advice givers, AI researchers, etc., with the later things shading into AGI in a way that makes it hard to distinguish between them.
it definitely seems worth investigating ways to make modular and bounded AIs more competitive, and CAIS more likely.
A major problem in predicting CAIS safety is to understand the order in which various services are likely to arise, in particular whether risk-reducing services are likely to come before risk-increasing services. This seems to require a lot of work in delineating various kinds of services and how they depend on each other as well as on algorithmic advancements, conceptual insights, computing power, etc. (instead of treating them as largely interchangeable or thinking that safety-relevant services will be there when we need them). Since this analysis seems very hard to do much ahead of time, I think we’ll have to put very wide error bars on any predictions of whether CAIS would be safe or unsafe, until very late in the game. (This seems like a natural perspective for thinking about CAIS safety, which appears to be missing from Eric’s report.)
Having said that, my feeling is that many risk-reducing services (especially ones that can address human safety problems) seem to require high-level general reasoning abilities, whereas many risk-increasing services can just be technical problem solvers or other kinds of narrow intelligences or optimizers. The latter are therefore likely to arrive earlier than the former, and as a result CAIS is actually quite unsafe, and hard to make safe, whereas AGI is by default highly unsafe, but with appropriate advances in safety research can perhaps be made safe. So I disagree with the proposal to push for CAIS, at least until we can better understand the strategic landscape. See also this comment where I made some related points.
AGI is … something that approximates an expected utility maximizer.
This seems like a trait which AGIs might have, but not a part of how they should be defined. I think Eric would say that the first AI system which can carry out all the tasks we would expect an AGI to be capable of won’t actually approximate an expected utility maximiser, and I consider it an open empirical question whether or not he’s right.
Many risk-reducing services (especially ones that can address human safety problems) seem to require high-level general reasoning abilities, whereas many risk-increasing services can just be technical problem solvers or other kinds of narrow intelligences or optimizers, so CAIS is actually quite unsafe, and hard to make safe, whereas AGI / goal-directed agents are by default highly unsafe, but with appropriate advances in safety research can perhaps be made safe.
Yeah, good point. I guess that my last couple of sentences were pretty shallowly analysed, and I’ll retract them and add a more measured conclusion.
This seems like a trait which AGIs might have, but not a part of how they should be defined.
There’s a thing that Eric is arguing against in his report, which he calls an “AGI agent”. I think it is reasonable to say that this thing can be fuzzily defined as something that approximates an expected utility maximizer.
(By your definition of AGI, which seems to be something like “thing that can do all tasks that humans can do”, CAIS would be AGI, and Eric is typically contrasting CAIS and AGI.)
That said, I disagree with Wei that this is relatively crisp: taken literally, the definition is vacuous because all behavior maximizes some expected utility. Maybe we mean that it is long-term goal-directed, but at least I don’t know how to cash that out. I think I agree that it is more crisp than the notion of a “service”, but it doesn’t feel that much more crisp.
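One way to make the vacuity point concrete (a toy construction of my own, not something from Eric’s report): for any fixed deterministic policy, you can write down a utility function over trajectories that the policy trivially maximizes.

```latex
% Sketch: given a policy \pi, define a utility function that rewards exactly
% the behavior \pi produces.
\[
U_\pi(\tau) =
\begin{cases}
1 & \text{if every action in trajectory } \tau \text{ is the one } \pi \text{ would take,} \\
0 & \text{otherwise.}
\end{cases}
\]
% Then E_{\tau \sim \pi}[U_\pi(\tau)] = 1, the maximum possible value, while any
% policy that deviates from \pi with positive probability scores strictly less.
% So "maximizes some expected utility", with no restriction on the utility
% function, rules out no behavior at all.
```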
That said, I disagree with Wei that this is relatively crisp: taken literally, the definition is vacuous because all behavior maximizes some expected utility.
I think I meant more that an AGI’s internal cognition resembles that of an expected utility maximizer. But even that isn’t quite right since it would cover AIs that only care about abstract worlds or short time horizons or don’t have general intelligence. So yeah, I definitely oversimplified there.
Maybe we mean that it is long-term goal-directed, but at least I don’t know how to cash that out.
What’s wrong with cashing that out as trying to direct/optimize the future according to some (maybe partial) preference ordering (and using a wide range of competencies, to cover “general”)? You said “In fact, I don’t want to assume that the agent even has a preference ordering” but I’m not sure why. Can you expand on that?
You said “In fact, I don’t want to assume that the agent even has a preference ordering” but I’m not sure why.
You could model a calculator as having a preference ordering, but that seems like a pretty useless model. Similarly, if you look at current policies that we get from RL, it seems like a relatively bad model to say that they have a preference ordering, especially a long-term one. It seems more accurate to say that they are executing a particular learned behavior that can’t be easily updated in the face of changing circumstances.
On the other hand, the (training process + resulting policy) together is more reasonably modeled as having a preference ordering.
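A toy sketch of that distinction (purely illustrative names and logic, not real RL code): the frozen policy is just a lookup from observations to actions, while the comparison between outcomes lives entirely in the training loop.

```python
import random

# A frozen "policy": a fixed mapping from observations to actions. Nothing in
# here ranks outcomes or searches for better ones; it just executes a behavior.
policy = {"low_battery": "recharge", "obstacle_ahead": "turn_left"}

def act(observation):
    return policy.get(observation, "do_nothing")

# The training process, by contrast, explicitly compares actions under a reward
# function and moves the policy toward whichever it ranks higher. "Has a
# preference ordering" describes (training loop + policy) much better than it
# describes the frozen policy alone.
def train(policy, reward_fn, observations, candidate_actions, steps=1000):
    for _ in range(steps):
        obs = random.choice(observations)
        current = policy.get(obs, "do_nothing")
        challenger = random.choice(candidate_actions)
        if reward_fn(obs, challenger) > reward_fn(obs, current):
            policy[obs] = challenger  # the preference shows up here, not in act()
    return policy
```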
While it’s true that so far the only model we have for getting generally intelligent behavior is to have a preference ordering (perhaps expressed as a reward function) that is then optimized, it doesn’t seem clear to me that any AI system we build must have this property. For example, GOFAI approaches do not seem like they are well-modeled as having a preference ordering, similarly with theorem proving.
(GOFAI and theorem proving are also examples of technologies that could plausibly have led to what-I-call-AGI-which-is-not-what-Eric-calls-an-AGI-agent, but whose internal cognition does not resemble that of an expected utility maximizer.)
Responding to this very late, but: If I recall correctly, Eric has told me in personal conversation that CAIS is a form of AGI, just not agent-like AGI. I suspect Eric would agree broadly with Richard’s definition.
I agree that the set of services is intended to, in aggregate, perform any task (that’s what the “Comprehensive” part of “Comprehensive AI Services” means), and it shares that property with AGI (that’s what the “General” part of “Artificial General Intelligence” means).
There are other properties that Bostrom/Yudkowsky conceptions of AGI have that CAIS doesn’t have, including “searching across long-term plans to find one that achieves a potentially-unbounded goal, which involves deceiving or overpowering humans if they would otherwise try to interfere”.
I don’t particularly care what terminology we use; I just want us to note which properties a given system or set of systems does and does not have.