I think you are pointing out a real tradeoff, though I personally wouldn't put much weight on AI control work accelerating AI capabilities as a negative factor. At least one actor in the AI race (I'm thinking of OpenAI here) will by default scale capabilities about as fast as is feasible, so methods that make AI more controllable are close to strict safety improvements against existential catastrophes that depend on AI control having gone awry.
I'm also not confident enough that control/alignment measures will work out by default to think that progress on AI alignment/control work is negative, though I do think it may soon stop being the best approach to keeping humanity safe.
However, I think this post does point to a real tradeoff that I suspect will soon become fairly tight: there is a tension between making AI more controllable and making AI hard for very bad humans to abuse. More importantly, better alignment also increases what dictators can do with AI, and more worryingly still, it increases s-risks.
Don't mistake this for an endorsement of Andrew Sauer's proposed solution, because I don't endorse it. But there is a clear reason to expect that plausibly large numbers of people could suffer horrifically in an AI future: technology such as mind uploading, combined with the fact that many humans genuinely have a hated outgroup they badly want to abuse, means that large-scale suffering could be produced cheaply.
And in a world where essentially all humans have zero, or even negative, economic value, there is no force pushing back against torturing a large portion of the citizenry.
For understanding why s-risk could be a very big problem, I also recommend the book Avoiding the Worst (linked below).
I don't agree with the post's conclusion that alignment and safety research should be kept private, since I still think it's positive in expectation for people to have more control over AI systems, but I do agree with its core point that there is a real tradeoff here.
See links below:
The case against AI alignment: https://www.lesswrong.com/posts/CtXaFo3hikGMWW4C9/the-case-against-ai-alignment
Avoiding the Worst (Amazon): https://www.amazon.com/dp/B0BK59W7ZW
Avoiding the Worst (free PDF): https://centerforreducingsuffering.org/wp-content/uploads/2022/10/Avoiding_The_Worst_final.pdf