Elias Schmied comments on Where I agree and disagree with Eliezer

Elias Schmied 29 Jun 2022 18:52 UTC
1 point
0
I must be missing something here. Isn’t optimizing necessary for superhuman behavior? So isn’t “superhuman behavior” a strictly stronger requirement than “being a mesaoptimizer”? So isn’t it clear which one happens first?
- paulfchristiano 29 Jun 2022 22:45 UTC
  7 points
  0
  Fast imitations of subhuman behavior or imitations of augmented humans are also superhuman. As is planning against a human-level imitation. And so on.
  It’s unclear if systems trained in that way will be imitating a process that optimizes, or will be optimizing in order to imitate. (Presumably they are doing both to varying degrees.) I don’t think this can be settled a priori.
  - Elias Schmied 30 Jun 2022 4:41 UTC
    4 points
    0
    This “imitating an optimizer” / “optimizing to imitate” dichotomy seems unnecessarily confusing to me. Isn’t it just inner alignment / inner misalignment (with the human behavior you’re being trained on)? If you’re imitating an optimizer, you’re still an optimizer.
    - David Johnston 1 Jul 2022 0:00 UTC
      2 points
      0
      I agree with this. If the key idea is, for example, that optimising imitators generalise better than imitations of optimisers, or that they pursue simpler goals, then it seems to me it would be better to draw distinctions based directly on generalisation or goal simplicity rather than on optimising imitators versus imitations of optimisers.
    - Elias Schmied 30 Jun 2022 5:02 UTC
      1 point
      0
      Sorry, I should be more specific. We are talking about AGI safety, and it seems unlikely that running narrow AI faster gets you AGI. I’m not sure if you disagree with that. I don’t understand what you mean by “imitations of augmented humans” and “planning against a human-level imitation”.