Subtle point: the key question is not how certain we are, but how certain the predictor system (e.g. GPT) is. Presumably if it’s able to generalize that far out of distribution at all, it’s likely to have enough understanding to make a pretty high-confidence guess as to whether AGI will take over or not. We humans might not know the answer very confidently, but an AI capable enough to apply the human mimicry strategy usefully is more likely to know the answer very confidently, whatever that answer is.
Still, that is very bad news for us (assuming the predictor is even close to right). I hope things aren't as pessimistic as this, but if something like that 1 in a million chance were correct, then we'd need to take far more extreme actions or just give up on the AI safety project, as it would effectively no longer be tractable.