These involve extinction, so they don't answer the question of what the most likely outcome is conditional on non-extinction. I think the answer there is a specific kind of near-miss at alignment, which is quite scary.
My point is that Pr[non-extinction | misalignment] << 1, Pr[non-extinction | alignment] = 1, and Pr[alignment] is not that low; therefore, by Bayes, Pr[misalignment | non-extinction] is low. Survival is strong evidence of alignment.
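To make the Bayes step concrete, here is a minimal sketch with purely illustrative numbers (the probabilities below are hypothetical placeholders, not anyone's actual estimates):

```python
# Hypothetical illustrative probabilities, chosen only to show the shape
# of the argument; the original text does not commit to specific numbers.
p_align = 0.3                    # Pr[alignment]: "not that low"
p_misalign = 1 - p_align         # Pr[misalignment]
p_surv_given_align = 1.0         # Pr[non-extinction | alignment] = 1
p_surv_given_misalign = 0.05     # Pr[non-extinction | misalignment] << 1

# Law of total probability: Pr[non-extinction]
p_surv = p_surv_given_align * p_align + p_surv_given_misalign * p_misalign

# Bayes' rule: Pr[misalignment | non-extinction]
posterior = p_surv_given_misalign * p_misalign / p_surv
print(round(posterior, 3))  # prints 0.104
```

With these placeholder inputs, conditioning on survival drives the probability of misalignment from 0.7 down to about 0.1, which is the intended conclusion: most surviving worlds are aligned worlds.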
To me it feels like alignment is a tiny target to hit, and around it there’s a neighborhood of almost-alignment, where enough is achieved to keep people alive but locked out of some important aspect of human value. There are many aspects such that missing even one or two of them is enough to make life bad (complexity and fragility of value). You seem to be saying that if we achieve enough alignment to keep people alive, we have >50% chance of achieving all/most other aspects of human value as well, but I don’t see why that’s true.
I think where we differ is that I think Pr[full alignment] is extremely low, and there is quite a lot of space for non-omnicidal partial misalignment.