I will say that your arguments against human mimicry also imply that extreme actions are likely necessary to prevent misaligned AI. To put it another way, the reason the AI is simulating that world is that if we are truly doomed, with only a 1 in a million chance of alignment and AGI getting built anyway, then pivotal acts or extreme circumstances are required for the future to go well at all, and in most cases you die anyway despite the attempt. Now don’t get me wrong, this could happen, but right now that level of certainty isn’t warranted.
So if I can draw a takeaway from this post, it’s that you think our situation is so bad that nothing short of a mathematical proof of alignment or extreme pivotal acts is sufficient. Is that right? And if so, why do you think we have such low chances of making alignment work?
Subtle point: the key question is not how certain we are, but how certain the predictor system (e.g. GPT) is. Presumably if it’s able to generalize that far out of distribution at all, it’s likely to have enough understanding to make a pretty high-confidence guess as to whether AGI will take over or not. We humans might not know the answer very confidently, but an AI capable enough to apply the human mimicry strategy usefully is more likely to know the answer very confidently, whatever that answer is.
Still, that is very bad news for us (assuming the predictor is even close to right). I hope things don’t get as pessimistic as that, but if something like that 1 in a million chance were right, then we’d need to take far more extreme actions or just give up on the AI safety project, as it would effectively no longer be tractable.