This seems both inaccurate and highly controversial. (On the controversy side, it implies there is nothing AI alignment can do: not only can we not make AI safer, we couldn't even deliberately make AI more dangerous if we tried.)
Accuracy-wise, you may not be able to know much about superintelligences, but even if you were to go with a uniform prior over outcomes, what that looks like depends tremendously on the sample space.
For instance, take the following argument: When transformative AI emerges, all bets are off, which means that no particular number of humans left alive should be a privileged hypothesis. Thus, it makes sense to treat "number of humans alive after the singularity" as uniformly distributed between 0 and N, where N is the number of humans in an intergalactic civilisation, so the chance of humanity being wiped out is almost zero.
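To make the arithmetic of that (deliberately flawed) argument explicit, here is a quick sketch, with N standing in for the population of an intergalactic civilisation:

$$P(\text{extinction}) = P(X = 0) = \frac{1}{N+1}, \qquad X \sim \mathrm{Uniform}\{0, 1, \dots, N\},$$

which is vanishingly small for any N in the trillions or beyond.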
If we want to use only binary hypotheses instead of numerical ones, I could instead say that each individual human has a 50/50 chance of survival, meaning that, aggregated over everyone, roughly half of humanity survives and again the chance of humanity being wiped out is basically zero.
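Again, the (equally flawed) arithmetic, as a sketch assuming a current population of roughly M ≈ 8 × 10^9 and treating each survival as an independent fair coin flip:

$$\mathbb{E}[\text{survivors}] = \frac{M}{2}, \qquad P(\text{everyone dies}) = \left(\tfrac{1}{2}\right)^{M} \approx 10^{-2.4 \times 10^{9}},$$

so under this prior, extinction comes out astronomically unlikely, which is exactly what makes the argument's structure suspect.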
This is not a good argument, but it isn't obvious to me how its structure differs from the structure of your argument.
I see your point, and I agree that the prior/distribution matters. It always does. I guess my initial point is that a long-term prediction in a fast-moving "pre-paradigmatic" field is a fool's errand. As for survival of the species vs. a single individual, it is indeed hard to tell. One argument that can be made is that a Thanos-AI does not make a lot of sense: major forces have major consequences, and whole species and ecosystems have been wiped out before, many times. One can also point out that there are long tails whenever there are lots of disparate variables, so there might be pockets of human or human-like survivors even after a major calamity, making full extinction unlikely. It is really hard to tell long in advance what reference class the AI advances will fall into. Maybe we should just call it Knightian uncertainty...
I agree that it is very difficult to make predictions about something that is A) probably a long way away (where "long" here means more than a few years) and B) likely to change things a great deal no matter what happens.
I think the correct response to this uncertainty is to reason about it normally but with very wide confidence intervals, rather than anchoring on 50% on the grounds that "either X will happen or it won't."
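To put the "wide intervals, not a 50% anchor" point in symbols (one illustrative formalisation, not the only one), compare a point anchor with a credence that is itself held with wide uncertainty:

$$p = 0.5 \ \text{(anchor)} \qquad \text{vs.} \qquad p \sim \mathrm{Beta}(1, 1): \ \mathbb{E}[p] = 0.5, \ \text{central 90\% interval} \approx [0.05, 0.95].$$

Both give the same headline number, but only the latter represents the wide-interval view and updates readily as evidence arrives.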