The issue is the payoffs involved. Even if the risk is, say, 50%, it's still individually rational to take the plunge, because in expected-value terms the other 50% outweighs everything else. I don't believe this myself, for a multitude of reasons, but it's useful as an illustration.
The payoffs are essentially: cooperate and reduce x-risk from, say, 50% to 1%, which gives a utility of perhaps 50–200; or defect and gain an expected utility of 10^20 or more, if we grant the common LW assumption that AI is the most important invention in human history.
Meanwhile, for everyone else, cooperation carries the same utility as individual defection in this scenario, i.e. 10^20 or more, whereas defection essentially reverses the sign, i.e. −10^20 or worse.
The problem is that without a way to enforce cooperation, it’s too easy to defect until everyone dies.
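To make the asymmetry concrete, here is a minimal expected-value sketch of that payoff structure. Every number is an illustrative placeholder taken from the figures above (50% vs. 1% risk, a 10^20 upside, a finite value placed on one's own life), not an estimate I'm defending.

```python
# Toy expected-value sketch of the defect/cooperate payoffs described above.
# All numbers are illustrative placeholders, not empirical estimates.

UPSIDE = 1e20        # assumed utility to the actor if the gamble pays off
DEATH = -1e6         # assumed *finite* disutility the actor assigns to dying
COOP_UTILITY = 200   # modest utility of cooperating and cutting risk to ~1%

def ev_defect(p_doom: float) -> float:
    """Expected value, to the individual actor, of racing ahead."""
    return (1 - p_doom) * UPSIDE + p_doom * DEATH

# Even at a 50% chance of doom, defecting dwarfs cooperating for the actor,
# while everyone else bears the 10^20-sized downside they cannot veto.
print(f"EV(defect): {ev_defect(0.5):.3g}")   # ~5e+19
print(f"EV(cooperate): {COOP_UTILITY}")      # 200
```

Nothing in that table changes the individually rational move unless cooperation can be enforced, which is the point.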
Now, thankfully, I believe existential risk is a lot lower than that. But if it were high in my model, then we would eventually need to start enforcing cooperation, because these incentives would be dangerous.
I don’t believe that, thankfully.
I’m going to naively express something that your risk calculation makes me think:
I think EY, I, and others who are persuaded by him seem to be rating the utility of an x-risk outcome as nothing less (more?) than negative infinity. I.e., whether the risk is 1% or 50%, our expected utility from AI x-risk calculates to approximately negative infinity, which outweighs even a 99% chance of 10^20+ utility.
This is why shutting it down seems to be the only logical move in this calculation right now. If you think a negative-infinity outcome exists at all in the outcome space, then the only solution is to avoid that outcome space completely until you can be assured it no longer contains a potentially negative-infinity outcome. It's not about driving that outcome down to some tiny probability; it's about eliminating it from the outcome space entirely.
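Taken literally, that arithmetic collapses in a few lines. This is only a sketch of the reasoning being described, with −∞ standing in for "an outcome we refuse to trade against at any probability":

```python
import math

UPSIDE = 1e20       # the 10^20+ utility from the discussion above
DOOM = -math.inf    # extinction treated as unboundedly bad

def ev(p_doom: float) -> float:
    """Expected utility when the bad outcome is assigned -infinity."""
    if p_doom == 0:
        return UPSIDE   # outcome removed from the outcome space entirely
    return (1 - p_doom) * UPSIDE + p_doom * DOOM

print(ev(0.50))   # -inf
print(ev(0.01))   # -inf: shrinking the probability doesn't help
print(ev(0.0))    # 1e+20: only eliminating the outcome changes the answer
```

The special case for p_doom == 0 is the whole argument: no finite reduction in probability escapes the −∞ term; only removing the outcome does.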
The problem is that the key actor here is of course OpenAI, not Eliezer, so how Eliezer values x-risk is not relevant to the analysis. What matters is how much the people at AI companies disvalue their own deaths, and given that I believe they don't value their lives infinitely, Eliezer's calculations don't matter, since he isn't a relevant actor at an AI company.