This topic is poorly understood; very high confidence is obviously wrong for any claim that isn’t exceptionally clear. Absence of doom is not such a claim, so the need to worry isn’t going anywhere.
This is why the post is so long: it has to integrate a lot of different sources of evidence, actually give substantial evidence for its major claims, and I had to make sure I actually have positive arguments that it’s very, very likely we will align AI, and arguably make it safe by default. That’s why I made the argument about AIs as white boxes, and why I pointed out that the genome uses very weak priors yet still aligns us, for example toward empathy for the ingroup, remarkably well. These were intended as reasons to expect safe AI by default in a very strong sense.
Also, there is a lot of untapped evidence from humans, and that’s what I was drawing on to make this post.
Quintin Pope and TurnTrout’s post, linked below, covers the wealth of evidence humans provide about alignment.
https://www.lesswrong.com/posts/CjFZeDD6iCnNubDoS/humans-provide-an-untapped-wealth-of-evidence-about
Without sufficient clarity, which humanity doesn’t possess on this topic, no amount of somewhat confused arguments is sufficient for the kind of certainty that makes the remaining risk of extinction not worth worrying about. It’s important to understand and develop what arguments we have, but in their present state they are not suitable for arguing this particular case outside their own assumption-laden frames.
When confronted with unknown unknowns outside their natural frames, such arguments might plausibly make it reasonable to believe the risk of extinction is as low as 10%, or as high as 90%, but nothing more extreme than that. Nowhere across this whole range of epistemic possibilities is a situation that we “mostly don’t need to worry about”.