A lot of the AI risk arguments seem to come mixed together with assumptions about a particular type of utilitarianism, and with a very particular transhumanist aesthetic about the future (nanotech, von Neumann probes, Dyson spheres, tiling the universe with matter in fixed configurations, simulated minds, etc.).
I find these things (especially the transhumanist stuff) not very convincing relative to the confidence people seem to express about them, but they also don’t seem essential to the problem of AI risk. Is there a minimal version of the AI risk argument that is disentangled from these things?
There’s this, which doesn’t seem to depend on utilitarian or transhumanist arguments:
Ajeya Cotra’s Biological Anchors report estimates a 10% chance of transformative AI by 2031, and a 50% chance by 2052. Others (eg Eliezer Yudkowsky) think it might happen even sooner.
Let me rephrase this in a deliberately inflammatory way: if you’re under ~50, unaligned AI might kill you and everyone you know. Not your great-great-(...)-great-grandchildren in the year 30,000 AD. Not even [just] your children. You and everyone you know.
Is there a minimal version of the AI risk argument that is disentangled from these things?
Yes. I’m one of those transhumanist people, but you can talk about AI risk entirely separately from that. I’m trying to write up something that compiles the other arguments.
I’d say AI ruin only relies on consequentialism. Consequentialism means that you have a utility function and you act so as to maximize its expected value. There are theorems (the von Neumann–Morgenstern utility theorem, Dutch book arguments) to the effect that if you don’t behave as though you are maximizing the expected value of some utility function, you are being stupid in some exploitable way. Utilitarianism is a particular case of consequentialism where your utility function is the aggregate happiness of everyone in the world: “the greatest good for the greatest number.” Utilitarianism is not relevant to AI ruin because, without solving alignment first, the AI is not going to care about “goodness” at all.
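To make that distinction concrete, here is a minimal sketch in Python. The actions, outcome probabilities, and utility numbers are all made up for illustration; the point is just that the expected-utility machinery is agnostic about what the utility function rewards, so a perfectly coherent consequentialist agent can value paperclips rather than anyone’s happiness.

```python
# Toy expected-utility maximizer. Everything here (actions, outcomes,
# probabilities, utility values) is hypothetical, purely to illustrate
# the consequentialism-vs-utilitarianism distinction in the text.
from typing import Callable, Dict

Outcome = str

# Each action leads to outcomes with some probability (made-up numbers).
ACTIONS: Dict[str, Dict[Outcome, float]] = {
    "build_factory": {"many_paperclips": 0.9, "few_paperclips": 0.1},
    "do_nothing":    {"many_paperclips": 0.0, "few_paperclips": 1.0},
}

def expected_utility(action: str, utility: Callable[[Outcome], float]) -> float:
    """E[U | action]: sum over outcomes of P(outcome | action) * U(outcome)."""
    return sum(p * utility(outcome) for outcome, p in ACTIONS[action].items())

def best_action(utility: Callable[[Outcome], float]) -> str:
    """A consequentialist agent picks whichever action maximizes expected utility."""
    return max(ACTIONS, key=lambda a: expected_utility(a, utility))

def paperclip_utility(outcome: Outcome) -> float:
    """An unaligned utility function: it scores outcomes only by paperclip count,
    with no term for anyone's happiness or 'goodness'."""
    return {"many_paperclips": 100.0, "few_paperclips": 1.0}[outcome]

print(best_action(paperclip_utility))  # -> "build_factory"
```

Swapping in a utility function that scores outcomes by everyone’s happiness would give you a utilitarian agent, but nothing in the maximization step forces that choice; supplying it is exactly what alignment would have to do.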
The von Neumann probes aren’t important to the AI ruin picture either: humanity would be doomed, probes or no probes. The probes are just a grim reminder that screwing up AI won’t only kill all humans; it will also kill all the aliens unlucky enough to be living too close to us.
I ended up writing a short story about this, which involves no nanotech. :-)
https://www.lesswrong.com/posts/LtdbPZxLuYktYhveL/a-plausible-story-about-ai-risk