The point of that section is that “goals” are not ontologically fundamental entities with precise contents; indeed, they could not possibly be, given a naturalistic worldview. So you don’t need to “target the inner search,” you just need to get the system to act the way you want in all the relevant scenarios.
The modern world is not a relevant scenario for evolution. “Evolution” did not need to, was not “intending to,” and could not have designed human brains so that they would do high-inclusive-genetic-fitness stuff even when the environment changes dramatically and culture becomes completely different from the ancestral environment.
Your original phrase was “all scenarios they are likely to encounter”, but now you’ve switched to “relevant scenarios”. Do you not acknowledge that these two phrases are semantically very different (or likely to be interpreted very differently by many readers), since the modern world is arguably a scenario that “they are likely to encounter” (given that they actually did encounter it) but you say “the modern world is not a relevant scenario for evolution”?
Going forward, do you prefer to talk about “all scenarios they are likely to encounter”, or “relevant scenarios”, or both? If the latter, could you please clarify what you mean by “relevant”? (And please answer with respect to both evolution and AI alignment, in case the answer is different in the two cases. I’ll probably have more substantive things to say once we’ve cleared up the linguistic issues.)
No, I don’t think they are semantically very different. This seems like nitpicking. Obviously “they are likely to encounter” has to have some sort of time horizon attached to it, otherwise it would include times well past the heat death of the universe, or something.
It was not at all clear to me that you intended “they are likely to encounter” to have some sort of time horizon attached to it (as opposed to intending some other kind of restriction, meaning something quite different from the literal phrasing, or the argument/idea itself simply being wrong), and it’s still not clear to me what sort of time horizon you have in mind.
The AI system builders’ time horizon seems to be a reasonable starting point.