JanB answers How much to optimize for the short-timelines scenario?

JanB 21 Jul 2022 12:30 UTC
I’m super interested in this question as well. Here are two thoughts:
1. It’s not enough to look at the expected “future size of the AI alignment community”; you need to look at the full distribution.
Let’s say timelines are long. We can assume that the benefits of alignment work scale roughly logarithmically with the resources invested. The derivative of log is 1/x, and that’s how the value of a marginal contribution scales.

There is some probability, let’s say 50%, that the world starts dedicating many resources to AI risk and the number of people working on alignment becomes massive, let’s say 10000x today’s. In these cases, your marginal contribution would be roughly zero. But there is some probability (let’s say the other 50%) that the world keeps being bad at preparing for potentially catastrophic events, and the AI alignment community is not much larger than today. In those cases, your contribution is worth about as much as under short timelines. In total, you’d only discount your contribution by about 50% (compared to short timelines).

This is just for illustration, and I made many implicit assumptions, like: the timing of the work doesn’t matter as long as it’s before AGI, early work does not influence the amount of future work, “the size of the alignment community at crunchtime” is identical to “future work done”, and so on...
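To make the arithmetic in the illustration above concrete, here is a minimal Python sketch. It assumes the log-returns model from the text, takes today’s community size as 1, and uses the made-up 50/50 split between “10000x today” and “same as today”. The names `marginal_value` and `scenarios` are just for this sketch.

```python
def marginal_value(community_size: float) -> float:
    """Marginal value of one more unit of work if benefits scale as log(size).

    The derivative of log(x) is 1/x; size is measured relative to today (= 1).
    """
    return 1.0 / community_size


# Illustrative scenarios from the text: (probability, community size relative to today).
scenarios = [(0.5, 10_000.0), (0.5, 1.0)]

# Correct: average the marginal value over the full distribution.
expected_marginal = sum(p * marginal_value(size) for p, size in scenarios)

# Misleading shortcut: plug the expected community size into 1/x.
expected_size = sum(p * size for p, size in scenarios)
naive_marginal = marginal_value(expected_size)

print(f"marginal value averaged over scenarios: {expected_marginal:.5f}")
print(f"marginal value at the expected size:    {naive_marginal:.5f}")
```

Relative to a short-timelines baseline of 1, averaging over the scenarios gives about 0.5, while plugging in the expected size gives about 0.0002. That gap is why the full distribution matters and not only its mean.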
2. It matters a lot how much better “work during crunchtime” is vs “work before crunchtime”.
Let’s say timelines are long, with AGI happening in 60 years. It’s totally conceivable that the world keeps being bad at preparing for potentially catastrophic events, and the AI alignment community in 60 years is not much larger than today. If mostly the work done at crunchtime (in the 10 years before AGI) matters, then the world would not be in a better situation than in the short-timelines scenario. If you could do productive work now to address this scenario, this would be pretty good (but you can’t, by assumption).

But if work done before crunchtime matters a lot, then even if the AI alignment community in 60 years is still small, we’ll probably at least have had 60 years of AI alignment work (from a small community). That’s much more than what we have in short-timelines scenarios (e.g. 15 years from a small community).
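The comparison in this second point can be sketched the same way. This is only a toy model under assumptions I’m adding for illustration: the community stays at a constant size of 1, crunchtime is the last 10 years before AGI, and each year of pre-crunchtime work counts as `pre_crunch_weight` years of crunchtime work. The function name and parameters are hypothetical.

```python
def total_effective_work(
    years_to_agi: float,
    pre_crunch_weight: float = 1.0,
    crunch_years: float = 10.0,
    community_size: float = 1.0,
) -> float:
    """Effective alignment work done before AGI, in crunchtime-equivalent years.

    Toy model: constant community size; pre-crunchtime years are discounted
    by pre_crunch_weight (0 = worthless, 1 = as good as crunchtime work).
    """
    crunch = min(years_to_agi, crunch_years)
    pre_crunch = years_to_agi - crunch
    return community_size * (crunch + pre_crunch_weight * pre_crunch)


# Compare long (60 years) and short (15 years) timelines for a few weights.
for weight in (0.0, 0.5, 1.0):
    long_work = total_effective_work(60, pre_crunch_weight=weight)
    short_work = total_effective_work(15, pre_crunch_weight=weight)
    print(f"weight={weight}: long={long_work}, short={short_work}")
```

With a weight of 0 the two scenarios come out identical (10 vs 10), which is the “no better than short timelines” case. With a weight of 1 the long-timelines world has 60 vs 15 years of work behind it.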