I can’t help with the object-level determination, but I think you may be overrating both the balance and import of the second-order evidence.
As far as I can tell, Yudkowsky is a (?dramatically) pessimistic outlier among the class of “rationalist/rationalist-adjacent” SMEs in AI safety, and probably even more so relative to aggregate opinion without an LW-y filter applied (cf.). My impression of the epistemic track record is that Yudkowsky has a tendency to stake out positions (both within and without AI) with striking levels of confidence but not commensurately striking levels of accuracy.
In essence, I doubt there’s much epistemic reason to defer to Yudkowsky more (or much more) than to folks like Carl Shulman or Paul Christiano, nor perhaps much more than to “a random AI alignment researcher” or “a superforecaster making a guess after watching a few Rob Miles videos” (although these comparisons carry a few implied premises around difficulty curves/subject matter expertise being relatively uncorrelated with judgemental accuracy).
I suggest ~all reasonable attempts at an idealised aggregate wouldn’t take a hand-brake turn to extreme pessimism on finding Yudkowsky is this pessimistic. My impression is the plurality LW view has shifted from “pretty worried” to “pessimistic” (e.g. p(screwed) > 0.4) rather than to agreement with Yudkowsky, but in any case I’d attribute large shifts in this aggregate mostly to Yudkowsky’s cultural influence on the LW community, plus some degree of internet cabin fever (and selection) distorting collective judgement.
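To make the aggregation point concrete, here is a toy sketch of my own (nothing in the comment specifies a pooling rule): under one common rule, the geometric mean of odds, adding a single near-certain forecast to a handful of moderate ones shifts the aggregate only moderately, not all the way to extreme pessimism. The probabilities below are invented purely for illustration.

```python
# Toy illustration (my own assumption, not the commenter's method): pool forecasts
# via the geometric mean of odds, one common aggregation rule. The numbers are
# made up; the point is that one extreme outlier moves the pooled estimate
# moderately rather than swinging it to near-certainty.
import math

def pool_geometric_mean_of_odds(probs):
    """Pool probabilities via the geometric mean of their odds."""
    log_odds = [math.log(p / (1 - p)) for p in probs]
    mean_log_odds = sum(log_odds) / len(log_odds)
    pooled_odds = math.exp(mean_log_odds)
    return pooled_odds / (1 + pooled_odds)

moderate_views = [0.05, 0.1, 0.15, 0.2, 0.3]   # hypothetical "pretty worried" forecasts
with_outlier = moderate_views + [0.95]         # add one extreme pessimist

print(round(pool_geometric_mean_of_odds(moderate_views), 2))  # ~0.14
print(round(pool_geometric_mean_of_odds(with_outlier), 2))    # ~0.26 — a shift, not a hand-brake turn
```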
None of this is cause for complacency: even if p(screwed) isn’t ~1, > 0.1 (or 0.001) is ample cause for concern, and resolution on values between (say) [0.1, 0.9] is informative for many things (like personal career choice). I’m not sure whether you get more yield for marginal effort on object-level or second-order uncertainty (e.g. my impression is the ‘LW cluster’ trends towards pessimism, so adjudicating whether this cluster should be over/under weighted could be more informative than trying to get up to speed on ELK). I would guess, though, that whatever distils out of LW discourse in 1-2 months will be much more useful than what you’d get right now.
> the class of “rationalist/rationalist-adjacent” SMEs in AI safety,
What’s an SME?
It stands for “subject matter expert”.
If Yudkowsky is needlessly pessimistic, I guess we get an extra decade of time. How are we going to use it? Ten years later, will we feel just as hopeless as today, and hope that we get another extra decade?
This phrasing bothers me a bit. It presupposes that it is only a matter of time; that there’s no error about the nature of the threat AGI poses, and no order-of-magnitude error in the timeline. The pessimism is basically baked in.
Fair point. We might get an extra century. Until then, it may turn out that we can somehow deal with the problem, for example by having a competent and benevolent world government that can actually prevent the development of superhuman AIs (perhaps by using millions of exactly-human-level AIs who keep each other in check and together endlessly scan all computers on the planet).
I mean, a superhuman AI is definitely going to be a problem of some kind, at least economically and politically. But in the best case, we may be able to deal with it, either because we somehow get more competent quickly, or because we have enough time to become more competent gradually.
Maybe even this is needlessly pessimistic, but if so, I don’t see how.