Sure, if somehow you don’t think foom is going to happen any year now, you also would think alignment is easy. At this point I’m not sure how you could believe that.
if somehow you don’t think foom is going to happen any year now, you also would think alignment is easy
Not necessarily. If you expect that alignment techniques which were previously somewhat effective stop working at AGI/ASI thresholds, and that there are no second chances to get experimental feedback, then alignment doesn’t become much easier even if it’s decades away. In that hypothetical there is time to build up theory, with some chance of finding things in advance that help, but the extra time is not predictably crucial.
I believe you’re saying that if foom is more than a few years away, it becomes easy to solve the alignment problem before then. I certainly think it becomes easier.
But “foom more than a few years away → the alignment problem is easy” is not the view I expressed. Among other highly tentative assertions, I claimed “the alignment problem is hard → foom more than a few years away”. The two are opposed in the sense that they have different truth values when alignment is hard: the first then requires foom to be soon, while the second requires it to be more than a few years away. The distinction is that our chances of solving the alignment problem depend on the time to takeoff, and are not the same thing as the difficulty of the problem.
So, you mentioned a causal link between time to foom and the chances of solving alignment, which I agree with. I am also asserting a “causal link” between the difficulty of the alignment problem and time to foom (though the counterfactuals here may not be as well defined).
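To make the truth-value point concrete, here is a minimal sketch that enumerates the four cases. The names `implies`, `truth_table`, `hard`, and `late` are mine, introduced only for illustration, and reading both claims as material implications is a simplifying assumption (the claims as stated are tentative, not strict logic).

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q


def truth_table():
    """Evaluate both claims for every combination of (hard, late).

    hard: the alignment problem is hard (assumed binary for illustration)
    late: foom is more than a few years away (assumed binary for illustration)
    """
    rows = []
    for hard, late in product([True, False], repeat=2):
        claim_a = implies(late, not hard)  # "late foom -> alignment is easy"
        claim_b = implies(hard, late)      # "alignment is hard -> late foom"
        rows.append((hard, late, claim_a, claim_b))
    return rows


if __name__ == "__main__":
    for hard, late, claim_a, claim_b in truth_table():
        print(f"hard={hard!s:5} late={late!s:5} A={claim_a!s:5} B={claim_b!s:5}")
```

In both rows where `hard` is true the two claims disagree, and in both rows where it is false they are both true, so the claims only come apart under the assumption that alignment is hard.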
As for how you could possibly believe foom is not going to happen any year now: my opinion depends on precisely what you mean by foom and by “any year now”, but I think I outlined scenarios where it takes ~25 years and ~60 years. Do you have a reason to think both of those are unlikely? It seems to me that hard takeoff within ~5 years relies on the assumptions I mentioned about recursive algorithmic improvement taking place near human level. That seems plausible, but I am not confident it will happen. How surprised will you be if foom doesn’t happen within 5 years?
I do expect the next 10 years to be pretty strange, but under the assumptions of the ~60 year scenario the status quo may not be completely upset that soon.