I think that the first paragraph after the block quote is highly confused.
Your actions depend on your utility function, the actions you have available, and the probabilities you assign to various outcomes conditional on each action. Let’s look at a few examples. (The numbers are contrived.)
These examples are deliberately constructed to show that expected utility theory doesn’t blindly output “Work on AI risk” regardless of input. Other assumptions would favour working on AI risk.
You are totally selfish and old. The field of AI is moving slowly enough that it looks like not much will happen in your lifetime. You have a strong dislike of doing anything resembling AI safety work, and there isn’t much you could do. If you were utterly confident AI wouldn’t come in your lifetime, you would have no reason to care. But probabilities aren’t 0. So let’s say you think there is a 1% chance of AI in your lifetime and, if it does arrive, a 1 in a million chance that your efforts will make the difference between aligned and unaligned AI. U(Rest of life doing AI safety)=1, U(Wiped out by killer AI)=0, U(Rest of life having fun)=2 and U(Living in FAI utopia)=10. Then the expected utility of having fun is 2*0.99+0.01*x*10 and the expected utility of AI safety work is 1*0.99+0.01*(x+0.000001)*10, where x is the chance of FAI. The latter expected utility is lower by about 0.99, whatever x is.
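To make the arithmetic in this example checkable, here is a minimal Python sketch. It only restates the numbers above; the names (P_AI, DELTA, eu_fun, eu_safety) are my own labels, and I am assuming the 1 in a million chance applies only in the branch where AI arrives, which is how the formulas treat it.

```python
# Sketch of the "selfish and old" example. All numbers come from the example;
# the variable and function names are hypothetical labels.
P_AI = 0.01      # chance AI arrives in your lifetime
DELTA = 1e-6     # chance your work flips the outcome to aligned AI (assumed conditional on AI arriving)
U_SAFETY = 1.0   # rest of life doing AI safety
U_FUN = 2.0      # rest of life having fun
U_DOOM = 0.0     # wiped out by killer AI
U_UTOPIA = 10.0  # living in FAI utopia


def eu_fun(x):
    """Expected utility of having fun; x is the chance of FAI if AI arrives."""
    return U_FUN * (1 - P_AI) + P_AI * (x * U_UTOPIA + (1 - x) * U_DOOM)


def eu_safety(x):
    """Expected utility of doing AI safety work, which raises x by DELTA."""
    p_fai = x + DELTA
    return U_SAFETY * (1 - P_AI) + P_AI * (p_fai * U_UTOPIA + (1 - p_fai) * U_DOOM)


if __name__ == "__main__":
    for x in (0.0, 0.5, 0.9):
        gap = eu_fun(x) - eu_safety(x)
        print(f"x={x}: fun={eu_fun(x):.7f} safety={eu_safety(x):.7f} gap={gap:.7f}")
```

The gap does not depend on x: the x terms cancel, leaving 0.99 minus the tiny contribution from DELTA.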
You are a perfect total utilitarian, and highly competent. You estimate that the difference between galactic utopia and extinction is so large that all other bits of utility are negligible in comparison. You estimate that if you work on biotech safety, there is a 6% chance of AI doom, a 5% chance of bioweapon doom, and the remaining 89% chance of galactic utopia. You also estimate that if you work on AI safety, there is a 5.9% chance of AI doom and a 20% chance of bioweapon doom, leaving only a 74.1% chance of galactic utopia. (You are really good at biosafety in particular.) Since 89% beats 74.1%, you choose to work on biotech safety.
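The same comparison as a short Python sketch, in case it helps. Because utopia versus extinction swamps everything else, maximising expected utility reduces to maximising the chance of utopia. The dictionary layout and the p_utopia helper are my own illustration, and I am assuming the two kinds of doom are mutually exclusive, as the sums in the example imply.

```python
# Sketch of the total-utilitarian example. Probabilities are from the example;
# the data layout and helper name are hypothetical.
options = {
    "biotech safety": {"ai_doom": 0.06, "bio_doom": 0.05},
    "AI safety": {"ai_doom": 0.059, "bio_doom": 0.20},
}


def p_utopia(risks):
    """Chance of galactic utopia, assuming the dooms are mutually exclusive."""
    return 1.0 - sum(risks.values())


# With utopia as the only outcome that matters, pick the option that makes it most likely.
best = max(options, key=lambda name: p_utopia(options[name]))

if __name__ == "__main__":
    for name, risks in options.items():
        print(f"{name}: P(utopia) = {p_utopia(risks):.3f}")
    print("choose:", best)
```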
You are an average utilitarian, taking your utility function to be U=pleasure/(pleasure+suffering) over all minds you consider capable of such feelings. If a galactic utopia occurs, its size is huge enough to wash out everything that has happened on Earth so far, leaving a utility of basically 1. You think there is a 0.1% chance of this happening. You think humans on average experience 2x as much pleasure as suffering, farm animals on average experience 2x as much suffering as pleasure, and there are equal numbers of each. Hence in the 99.9% case where AI wipes us out, the utility is exactly 0.5. However, you have a chance to reduce the number of farm animals that will ever exist by 10%, leaving a utility of (2+0.9)/(2+0.9+1+1.8)≈0.509. This increases your expected utility by about 0.009. An opportunity to increase the chance of FAI galactic utopia from 0.1% to 1.1% is only worth 0.005 (a 1% chance of going from U=0.5 to U=1). Therefore reducing the number of farm animals that will exist takes priority.
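A rough Python sketch of this last calculation, using only the numbers above. The function names and the animal_scale parameter are hypothetical; the sketch also weights the farm-animal gain by the 99.9% no-utopia branch, which is why it comes out slightly under 0.009.

```python
# Sketch of the average-utilitarian example. Numbers are from the example;
# names and the animal_scale parameter are hypothetical.
U_UTOPIA = 1.0  # utility if galactic utopia washes out everything else


def average_utility(animal_scale=1.0):
    """U = pleasure / (pleasure + suffering) with no utopia.

    Humans contribute pleasure 2 and suffering 1; farm animals contribute
    pleasure 1 and suffering 2, scaled by how many of them ever exist.
    """
    pleasure = 2.0 + 1.0 * animal_scale
    suffering = 1.0 + 2.0 * animal_scale
    return pleasure / (pleasure + suffering)


def expected_utility(p_utopia, animal_scale):
    """Mix the utopia branch with the no-utopia branch."""
    return p_utopia * U_UTOPIA + (1 - p_utopia) * average_utility(animal_scale)


baseline = expected_utility(0.001, 1.0)
gain_animals = expected_utility(0.001, 0.9) - baseline  # 10% fewer farm animals
gain_fai = expected_utility(0.011, 1.0) - baseline      # utopia chance 0.1% -> 1.1%

if __name__ == "__main__":
    print(f"gain from fewer farm animals: {gain_animals:.4f}")
    print(f"gain from more likely utopia: {gain_fai:.4f}")
```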
Thank you for those examples. I think they show that the way I used a utility function, without placing it in a ‘real’ situation (i.e. one that is not locked off from viable alternative actions with some utility of their own), is a fallacy.
I suppose then that I conflated “What can I know?” with “What must I do?”. Separating a belief from its associated action resolves (I think) most of the conflicts that I saw.