Subtle point: the key question is not how certain we are, but how certain the predictor system (e.g. GPT) is. Presumably if it’s able to generalize that far out of distribution at all, it’s likely to have enough understanding to make a pretty high-confidence guess as to whether AGI will take over or not. We humans might not know the answer very confidently, but an AI capable enough to apply the human mimicry strategy usefully is more likely to know the answer very confidently, whatever that answer is.
Still, that is very bad news for us (assuming the predictor is even close to right). I hope things aren't as pessimistic as this, but if something like that 1 in a million chance were correct, then we'd need to take far more extreme actions or just give up on the AI safety project, as it would effectively no longer be tractable.