This presents the propositions as strictly conjunctive when they really are not. The first two are arguably conjunctive with the rest, but they are also the propositions that few people dispute.
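To see why strictness matters: a chain of this form implicitly computes (a sketch in my own notation; the $E_i$ are shorthand for the tool’s five propositions, not labels the tool itself uses)

$$P(E_1 \wedge E_2 \wedge \dots \wedge E_5) = P(E_1)\,P(E_2 \mid E_1)\,P(E_3 \mid E_1, E_2) \cdots P(E_5 \mid E_1, \dots, E_4),$$

and this product equals the probability of the final outcome only if that outcome entails every earlier $E_i$. If the outcome can arrive by a path that skips one of the steps, the product undercounts it.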
The third proposition, “if we make them, they would be hard to align”, is quite vague. What exactly does “hard to align” mean? Treating it as conjunctive with the following “if misaligned, …” forces it to mean “of all such systems made in the future, at least one will be misaligned”, which is not at all a central reading of the question actually asked!
The fourth proposition is arguably even worse. It takes a specific scenario, “do a lot of damage by seeking power”, and makes it conjunctive with “if they do a lot of damage, …”. There are many ways a misaligned AI could do damage without “seeking power”. Even an aligned AI might cause a lot of damage, though I can see why we would set that possibility aside for AI safety purposes, since it gets into the weeds of what the word “aligned” really means.
The question then compounds the error with the construction “they … would”, which implies that a typical misaligned AI would. If there are 10,000 misaligned AIs in the world and someone judges only a 1% chance that any given one would cause a lot of damage, they could justifiably answer “1%” to this question, even though the actual conditional probability that at least one does a lot of damage is essentially 100%.
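To make the “typical” versus “any” gap concrete (treating the 10,000 AIs as independent, which is an assumption purely for illustration):

$$P(\text{at least one causes a lot of damage}) = 1 - (1 - 0.01)^{10000} \approx 1 - 2 \times 10^{-44} \approx 1.$$

A respondent answering per-AI gives 1%; the conjunctive chain needs the ~100% figure.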
The final proposition is likewise not strictly conjunctive: a powerful AI might disempower humanity without doing any actual damage at all, and the same criticism about “typical” versus “any” applies.
Without a lot of serious rephrasing, I rate this tool (and the reasoning behind it) as vastly more misleading than helpful.