What are the flaws in this argument about p(Doom)?
Technical alignment is hard
Technical alignment will take 5+ years
AI capabilities are currently subhuman in some areas (driving cars), roughly human-level in others (the bar exam), and superhuman in others (playing chess)
Capabilities scale with compute
The doubling time for AI compute is ~6 months
In 5 years, compute will therefore scale by 2^(5/0.5) = 2^10 = 1024 times (see the sketch after the argument)
In 5 years, with ~1024 times the compute, AI will be superhuman at most tasks, including designing AI
Designing a better version of itself will increase an AI’s reward
An AI will design a better version of itself and recursively loop this process until it reaches some limit
Such an AI will be superhuman at almost all tasks, including computer security, R&D, planning, and persuasion
The AI will deploy these skills to increase its reward
Human survival is not in the AI’s reward function
The AI will kill off most or all humans to prevent them from possibly decreasing its reward
Therefore: p(Doom) is high within 5 years
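For concreteness, here is a minimal numerical sketch of the two growth premises above (the compute scaling and the self-improvement loop). Every constant is either the argument’s own hypothetical (the ~6-month doubling time, the 5-year horizon) or an arbitrary placeholder (the per-generation gain and the limit); the sketch only makes the premises explicit, it doesn’t support them.

```python
# Sketch of the argument's growth premises. All numbers are the
# post's hypotheticals or arbitrary placeholders, not forecasts.

DOUBLING_TIME_YEARS = 0.5   # premise: AI compute doubles every ~6 months
HORIZON_YEARS = 5           # premise: the 5-year window

doublings = HORIZON_YEARS / DOUBLING_TIME_YEARS
print(f"2^{doublings:.0f} = {2 ** doublings:.0f}x compute in {HORIZON_YEARS} years")
# -> 2^10 = 1024x compute in 5 years

# Premise: the AI improves itself in a loop "until it reaches some limit".
capability = 1.0   # arbitrary starting capability
GAIN = 1.5         # arbitrary per-generation improvement factor
LIMIT = 1e6        # arbitrary stand-in for "some limit"

generations = 0
while capability < LIMIT:
    capability *= GAIN   # each generation designs a better successor
    generations += 1

print(f"Reaches the assumed limit after {generations} generations")
```

Note that both outputs are pure functions of the assumed constants: change the doubling time, the gain, or the limit, and the timeline moves with them.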
Despite what the title says, this is not a perfect argument tree. Which part do you think is the most flawed?
Edit: As per request, the title has been changed from the humorous “An utterly perfect argument about p(Doom)” to “What are the flaws in this argument about p(Doom)?”
Edit2: Yay, Frontpage! Totally for the wrong reasons, though
Edit3: added “, with ~1024 times the compute,” to “In 5 years AI will be superhuman at most tasks including designing AI”