I would assume it’s most impactful to focus on the marginal futures where we survive, rather than the median one? I.e., the futures where humanity barely solves alignment in time, has a dramatic close call with an AI disaster, almost fails to build the international agreement needed to suppress certain dangerous technologies, and so on.
IMO, the marginal futures where humanity survives are the scenarios where our actions have the most impact. In futures that are totally doomed, it’s worthless to try anything, and in futures that go absurdly well it’s similarly unimportant to contribute our own efforts. Just as our votes matter most in a very close election, our actions to advance AI alignment are most impactful in the scenarios balanced on a knife’s edge between survival and disaster.
(I think that’s the right logic for your altruistic AI safety research efforts, anyway. If you’re making personal plans, like deciding whether to have children or how much to save for retirement, that’s a different case with different logic.)
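To make the knife-edge intuition concrete, here is a minimal toy sketch, not from the original comment: it assumes P(survival) is a logistic function of aggregate alignment effort (the functional form, threshold, and steepness are purely illustrative assumptions) and asks how much one extra unit of effort moves that probability.

```python
import math

def p_survival(total_effort: float, threshold: float = 10.0, steepness: float = 1.0) -> float:
    """Assumed toy model: P(survival) as a logistic function of aggregate effort."""
    return 1.0 / (1.0 + math.exp(-steepness * (total_effort - threshold)))

def marginal_impact(total_effort: float, delta: float = 1.0) -> float:
    """How much one extra unit of effort shifts P(survival) in this toy model."""
    return p_survival(total_effort + delta) - p_survival(total_effort)

# Sweep from "basically doomed" through "knife-edge" to "fine without us".
for effort in [2, 6, 10, 14, 18]:
    print(f"effort={effort:>2}  P(survive)={p_survival(effort):.3f}  "
          f"marginal impact of +1 effort={marginal_impact(effort):.3f}")
```

Under these assumptions, the marginal impact of an extra unit of effort peaks near the knife-edge (P ≈ 0.5) and is tiny at either extreme, mirroring the close-election analogy above.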
I agree that this is accurate, but I worry that it doesn’t help the sort of person who wants just one future to put more weight on. Which futures count as marginal depends on the strategy you’re considering and on what actions you expect other people to take; you can’t just find some concrete future that is “the marginal future” and only take actions that affect that one future.
If you want to avoid the computational burden of consequentialism, then rather than focusing on just one future, I think a solid recommendation is the virtue-ethical “death with dignity” strategy.