I’m guessing the disagreement is that Yudkowsky thinks the holes are giant, visible, and gaping, whereas you think they are indeed holes but you have some ideas for how to fix them
I think we don’t know whether various obvious-to-us-now things will work with effort. I think we don’t really have a plan that would work with an acceptably high probability and stand up to scrutiny / mildly pessimistic assumptions.
I would guess that if alignment is hard, then whatever we do ultimately won’t follow any existing plan very closely (whether we succeed or not). I do think it’s reasonably likely to agree with existing plans at a very high level. I think that’s also true even in the much better worlds that do have tons of plans.
at any rate the plan is to work on fixing those holes and to not deploy powerful AGI until those holes are fixed
I wouldn’t say there is “a plan” to do that.
Many people have that hope, and have thought some about how we might establish sufficient consensus about risk to delay AGI deployment for 0.5-2 years if things look risky, and how to overcome various difficulties with implementing that kind of delay, or what kind of more difficult moves might be able to delay significantly longer than that.