Thank you. If I understand your explanation correctly, you are saying that there are alignment solutions that are rooted in more general avoidance of harm to currently living humans. If these turn out to be the only feasible solutions to the not-killing-all-humans problem, then they will produce not-killing-most-humans as a side-effect. Nuke analogy: if we cannot build/test a bomb without igniting the whole atmosphere, we’ll pass on bombs altogether and stick to peaceful nuclear energy generation.
It seems clear that such limiting approaches would be avoided by rational actors under winner-take-all dynamics, so long as other approaches remain that have not yet been falsified.
Follow-up question: does the “any meaningfully useful AI is also potentially lethal to its operator” assertion hold under the significantly different usefulness requirements of a much smaller human population? I’m imagining a limited AI that can only just “get the (hard) job done” of killing most people under the direction of its operators, and then support a “good enough” future for the remaining population. That second part isn’t the hard part, since the Earth itself is pretty good at supporting small human populations.