I can’t speak for anyone else, but for me at least the answers to most of your questions correlate because of my underlying model. However, it seems like correlation on most of these questions (not just a few pairs) is to be expected, as many of them probe a similar underlying model from a few different directions. Just because there are lots of individual arguments about various axes does not immediately imply that any or even most configurations of beliefs on those arguments are coherent. In fact, in a very successful truth-finding system we should also expect convergence on most axes of argument, with only a few cruxes remaining.
My answers to the questions:
1. I lean towards short timelines, though I have lots of uncertainty.
2. I mostly operate under the assumption of fast takeoffs. I think “slower” takeoffs (months to low years) are plausible but I think even in this world we’re unlikely to respond adequately and that iterative approaches still fail.
3. I expect values to be fragile. However, my model of how things go wrong depends on a much weaker version of this claim.
4. I don’t expect teaching a value system to an AGI to be hard or relevant to the reasons why things might go wrong.
5. I expect corrigibility to be hard.
6. Yes (depending on the definition of “simple”, they already happen), but I don’t expect it to update people sufficiently.
7. >90%, if you condition on alignment not being solved (which also precludes alignment being solved because it’s easy).
8. If you factor out the feasibility of implementation/enforcement, then merely designing a good governance mechanism is relatively easy. However, I’m not sure why you would ever want to do this factorization.
9. I expect implementation/enforcement to be hard.
What do you think of Paul Christiano’s argument that short timelines and fast takeoff are anti-correlated?
Insofar as that’s what he thinks, he’s wrong. Short timelines and fast takeoff are positively correlated. However, he argues (correctly) that slow takeoff also causes short timelines, by accelerating R&D progress sooner than it otherwise would have (for a given level of underlying difficulty-of-building-AGI). This effect is not strong enough in my estimation (or in Paul’s, if I recall correctly from conversations*) to overcome the other reasons pushing timelines and takeoff to be correlated.
*I think I remember him saying things like “Well yeah obviously if it happens by 2030 it’s probably gonna be a crazy fast takeoff.”
Did you mean “short timelines” in your third sentence?
Oops, yeah, sorry, typo, will fix.