Specifically, against the following view described by a comment:
There seems to be a lack of emphasis in this market on outcomes where alignment is not solved, yet humanity turns out fine anyway. Based on an Outside View perspective (where we ignore any specific arguments about AI and just treat it like any other technology with a lot of hype), wouldn’t one expect this to be the default outcome?
Take the following general heuristics:
If a problem is hard, it probably won’t be solved on the first try.
If a technology gets a lot of hype, people will think that it’s the most important thing in the world even if it isn’t. At most, it will only be important on the same level that previous major technological advancements were important.
People may be biased towards thinking that the narrow slice of time they live in is the most important period in history, but statistically this is unlikely.
If people think that something will cause the apocalypse or bring about a utopian society, historically speaking they are likely to be wrong.
This, if applied to AGI, leads to the following conclusions:
Nobody manages to completely solve alignment.
This isn’t a big deal, as AGI turns out to be disappointingly not that powerful anyway (or at most “creation of the internet” level influential but not “disassemble the planet’s atoms” level influential)
I would expect the average person outside of AI circles to default to this kind of assumption.
Ideally, answers would detail why the outside view presented here is less favored by the evidence than the idea that AGI or PASTA will be a big deal, as popularized by Holden Karnofsky. Also, ideally you can estimate how much impact AI will have in, say, this century.
Motivation: I’m asking this question because I notice an unstated assumption that AGI/AI will be a huge deal, and the answer to how big a deal it is would change virtually everything about how LW works. I’d really like to know why LWers hold that AGI/ASI will be a big deal.
Intuitively I would imagine that this should be weighted by something like population? So if we plot the distribution of humans vs time, doomers are saying that we are close to the median of the distribution, while optimists are saying that we are in the 0.0(multiple or many zeros)1% beginning of the distribution and that there is a huge world out there in the future.
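To make the population-weighted picture concrete, here is a minimal sketch. The function name and both multipliers (2x and 1,000,000x) are purely illustrative assumptions, not estimates of real population figures; the only point is how differently the two views place us in the distribution.

```python
def position_in_distribution(born_so_far: float, total_ever: float) -> float:
    """Fraction of all humans who will ever live that have already been born."""
    if not 0 < born_so_far <= total_ever:
        raise ValueError("need 0 < born_so_far <= total_ever")
    return born_so_far / total_ever


# Arbitrary unit: "1.0" stands for however many humans have been born so far.
born = 1.0

# Hypothetical doomer view: about as many humans still to come as have already lived.
doomer = position_in_distribution(born, total_ever=2.0 * born)

# Hypothetical optimist view: a million times more humans in total than so far.
optimist = position_in_distribution(born, total_ever=1e6 * born)

print(f"doomer view:   {doomer:.0%} of the way through")    # 50%, i.e. the median
print(f"optimist view: {optimist:.4%} of the way through")  # 0.0001%
```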
Meta: I might be reading some of the question incorrectly, but my impression is that it lumps “outside views about technology progress and hype cycles” together with “outside views about things people get doom-y about”.
If it is about “people being doom-y” about things, then I think we are more playing in the realm of things where getting it right on the first try or first few tries matters.
Expected values seem relevant here. If people think there is a 1% chance of a really bad outcome and try to steer against that, then even if they are correct you are going to see people pointing at things that didn’t turn out to be a big deal in 99 out of every 100 times this comes up. And if, that 1 other time, someone actually stopped something bad from happening, we’re much less likely to remember the time that “a bad thing failed to happen because it was stopped a few causal steps early”.
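A rough sketch of the arithmetic behind that, in Python. The function names and the harm and cost figures are made-up assumptions for illustration only; the 1% and the 100 occasions are the numbers from the paragraph above.

```python
from fractions import Fraction


def expected_alarm_split(n_warnings: int, p_bad: Fraction) -> tuple[Fraction, Fraction]:
    """Expected number of warnings that were real vs. ones that look like false alarms."""
    real = n_warnings * p_bad
    false_alarms = n_warnings - real
    return real, false_alarms


def worth_steering(p_bad: Fraction, harm_if_bad: int, cost_of_steering: int) -> bool:
    """True when the expected harm avoided exceeds the cost of acting on the warning."""
    return p_bad * harm_if_bad > cost_of_steering


p = Fraction(1, 100)  # the 1% chance of a really bad outcome

real, false_alarms = expected_alarm_split(100, p)
print(real, false_alarms)  # 1 99

# Hypothetical units: the bad outcome is 10,000x as costly as the effort to steer.
print(worth_steering(p, harm_if_bad=1_000_000, cost_of_steering=100))  # True
```

So even a perfectly calibrated worrier looks wrong 99 times out of 100 after the fact, while acting on the warning can still be the better bet whenever the harm is large relative to the cost of steering.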
There also seems to be a thing there where the doom-y folks are part of the dynamic equilibrium. My mind goes to nuclear proliferation and climate change.
Folks got really worried about us all dying in a global nuclear war, and that hasn’t happened yet, so we might be tempted to conclude that the people who were worried were just panicking and were wrong. It seems likely to me that some part of the reason we didn’t all die in a global nuclear war was that people were worried enough to collectively push over some unknowable-in-advance line, which led to enough coordination to at least stop things going terminally bad with short notice. Even then, we’ve still had wobbles.
If the general response to the doom-y folks back then had been “Nah, it’ll be fine”, delivered with enough skill / volume / force to cause people to stop waving their warning flags and generally stop trying to do things, my guess is that we might have had much worse outcomes.
I am lumping them together since if you believe AGI isn’t that impactful, then much argumentation around AI and alignment doesn’t matter at all. Obviously, there is the bias argument that you responded to around doom, but there is another prong to that argument.
It seems, at least to me, like the argumentation around AI and alignment would be a good source of new beliefs, since I can’t figure it all out on my own. People also seem to be figuring out new things fairly regularly.
Between those two things, I’m struggling to understand what it would be like to assert a static belief “field X doesn’t matter” in a way that is reasonably grounded in what is coming out of field X, particularly as field X evolves.
Like, if I believe that AI Alignment won’t matter much and I use that to write off the field of AI Alignment, it feels like I’m either pre-emptively ignoring potentially relevant information, or I’m making a claim that I have some larger grounded insights into how the field is confused.
I get that we’re all bounded and don’t have the time or energy or inclination to engage with every field and every argument within those fields. If the claim was something like “I don’t see AI alignment as a personal priority to invest my time/energy in” that feels completely fine to me—I think I would have nodded and kept scrolling rather than writing something.
Worrying about where other people are spending their energy is also fine! If it were me, I’d want to be confident I was more informed about something they’d all missed, otherwise I’d be in a failure mode I sometimes get into where I’m on a not-so-well-grounded hamster wheel of worrying.
I guess I’m trying to tease apart the cases where you are saying “I have a belief that I’m not willing to spend time/energy to update” vs “I also believe that no updates are coming and so I’m locking in my current view based on that meta-belief”.
I’m also curious!
If you’ve seen something that would tip my evidential scales the whole way to “the field is built on sketchy foundations, with probability that balances out the expected value of doom if AI alignment is actually a problem”, then I’d really like to know! Although I haven’t seen anything like that yet.
And I’m also curious about what prongs I might be missing around the “people following their expected values to prevent P(doom) look like folks who were upset about nothing in the timelines where we all survived to be having after-the-fact discussions about them” ;)
I think the key here is that if AGI is only something like, say, the internet, or perhaps the industrial revolution, then AGI alignment doesn’t matter much. A lot of the field of AGI alignment only really makes sense if the impact of AGI is very, very large.