Someone who is interested in learning and doing good.
My Twitter: https://twitter.com/MatthewJBar
My Substack: https://matthewbarnett.substack.com/
Doesn’t the revealed preference argument also imply people don’t care much about dying from aging? (This is invested in even less than catastrophic risk mitigation and people don’t take interventions that would prolong their lives considerably.) I agree revealed preferences imply people care little about the long run future of humanity, but they do imply caring much more about children living full lives than old people avoiding aging.
I agree that the amount of funding explicitly designated for anti-aging research is very low, which suggests society doesn’t prioritize curing aging as a social goal. However, I think your overall conclusion is significantly overstated. A very large fraction of conventional medical research specifically targets health and lifespan improvements for older people, even though it isn’t labeled explicitly as “anti-aging.”
Biologically, aging isn’t a single condition but rather the cumulative result of multiple factors and accumulated damage over time. For example, anti-smoking campaigns were essentially efforts to slow aging by reducing damage to smokers’ bodies—particularly their lungs—even though these campaigns were presented primarily as life-saving measures rather than “anti-aging” initiatives. Similarly, society invests a substantial amount of time and resources in mitigating biological damage caused by air pollution and obesity.
Considering this broader understanding of aging, it seems exaggerated to claim that people aren’t very concerned about deaths from old age. I think public concern depends heavily on how the issue is framed. My prediction is that if effective anti-aging therapies became available and proven successful, most people would eagerly purchase them for high sums, and there would be widespread political support to subsidize those technologies.
Right now explicit support for anti-aging research is indeed politically very limited, but that’s partly because robust anti-aging technologies haven’t been clearly demonstrated yet. Medical technologies that have proven effective at slowing aging (even if not labeled as such) have generally been marketed as conventional medical technologies and typically enjoy widespread political support and funding.
I agree that delaying a pure existential risk that has no potential upside—such as postponing the impact of an asteroid that would otherwise destroy complex life on Earth—would be beneficial. However, the risk posed by AI is fundamentally different from something like an asteroid strike because AI is not just a potential threat: it also carries immense upside potential to improve and save lives. Specifically, advanced AI could dramatically accelerate the pace of scientific and technological progress, including breakthroughs in medicine. I expect this kind of progress would likely extend human lifespans and greatly enhance our quality of life.
Therefore, if we delay the development of AI, we are likely also delaying these life-extending medical advances. As a result, people who are currently alive might die of aging-related causes before these benefits become available. This is a real and immediate issue that affects those we care about today. For instance, if you have elderly relatives whom you love and want to see live longer, healthier lives, then—assuming all else is equal—it makes sense to want rapid medical progress to occur sooner rather than later.
This is not to say we should accelerate AI recklessly, or push ahead even if doing so would dramatically increase existential risk. I am just responding to your objection, which was premised on the idea that delaying AI could be worth it even if it doesn’t reduce x-risk at all.
It sounds like you’re talking about multi-decade pauses and imagining that people agree such a pause would only slightly reduce existential risk. But, I think a well timed safety motivated 5 year pause/slowdown (or shorter) is doable and could easily cut risk by a huge amount.
I suspect our core disagreement here primarily stems from differing factual assumptions. Specifically, I doubt that delaying AI development—even if well timed and long in duration—would reduce existential risk by more than a tiny amount, though I acknowledge I haven’t said much to justify this claim here. Given that factual assumption, pausing AI development seems somewhat difficult to justify from a common-sense moral perspective, and very difficult to justify from a worldview that puts primary importance on people who currently exist.
My guess is that the “common sense” values tradeoff is more like 0.1% than 1% because of people caring more about kids and humanity having a future than defeating aging.
I suspect the common-sense view is closer to 1% than 0.1%, though this partly depends on how we define “common sense” in this context. Personally, I tend to look to revealed preferences as indicators of what people genuinely value. Consider how much individuals typically spend on healthcare and how much society invests in medical research relative to explicit existential risk mitigation efforts. There’s an enormous gap, suggesting society greatly values immediate survival and the well-being of currently living people, and places relatively lower emphasis on abstract, long-term considerations about species survival as a concern separate from presently existing individuals.
Politically, existential risk receives negligible attention compared to conventional concerns impacting currently-existing people. If society placed as much importance on the distant future as you’re suggesting, the US government would likely have much lower debt, and national savings rates would probably be higher. Moreover, if individuals deeply valued the flourishing of humanity independently of the flourishing of current individuals, we probably wouldn’t observe such sharp declines in birth rates globally.
None of these pieces of evidence is, on its own, a foolproof indicator that society doesn’t care much about existential risk, but taken together, they paint a picture of a society that’s significantly more short-term focused, and substantially more person-affecting, than you’re suggesting here.
I care deeply about many, many people besides just myself (in fact I care about basically everyone on Earth), and it’s simply not realistic to expect that I can convince all of them to sign up for cryonics. That limitation alone makes it clear that focusing solely on cryonics is inadequate if I want to save their lives. I’d much rather support both the acceleration of general technological progress through AI, and cryonics in particular, rather than placing all hope in just one of those approaches.
Furthermore, curing aging would be far superior to merely making cryonics work. The process of aging—growing old, getting sick, and dying—is deeply unpleasant and degrading, even if one assumes a future where cryonic preservation and revival succeed. Avoiding that suffering entirely is vastly more desirable than having to endure it in the first place. Merely signing everyone up for cryonics would be insufficient to address this suffering, whereas I think AI could accelerate medicine and other technologies to greatly enhance human well-being.
The value difference commenters keep pointing out needs to be far bigger than they represent it to be, in order for it to justify increasing existential risk in exchange for some other gain.
I disagree with this assertion. Aging poses a direct, large-scale threat to the lives of billions of people in the coming decades. It doesn’t seem unreasonable to me to suggest that literally saving billions of lives is worth pursuing even if doing so increases existential risk by a tiny amount [ETA: though to be clear, I agree it would appear much more unreasonable if the reduction in existential risk were expected to be very large]. Loosely speaking, this idea only seems unreasonable to those who believe that existential risk is overwhelmingly more important than every other concern by many OOMs—so much so that it renders all other priorities essentially irrelevant. But that’s a fairly unusual and arguably extreme worldview, not an obvious truth.
I am essentially a preference utilitarian and an illusionist regarding consciousness. This combination of views leads me to conclude that future AIs will very likely have moral value if they develop into complex agents capable of long-term planning, and are embedded within the real world. I think such AIs would have value even if their preferences look bizarre or meaningless to humans, as what matters to me is not the content of their preferences but rather the complexity and nature of their minds.
When deciding whether to attribute moral patienthood to something, my focus lies primarily on observable traits, cognitive sophistication, and most importantly, the presence of clear open-ended goal-directed behavior, rather than on speculative or less observable notions of AI welfare, about which I am more skeptical. As a rough approximation, my moral theory aligns fairly well with what is implicitly proposed by modern economists, who talk about revealed preferences and consumer welfare.
Like most preference utilitarians, I believe that value is ultimately subjective: loosely speaking, nothing has inherent value except insofar as it reflects a state of affairs that aligns with someone’s preferences. As a consequence, I am comfortable, at least in principle, with a wide variety of possible value systems and future outcomes. This means that I think a universe made of only paperclips could have value, but only if that’s what preference-having beings wanted the universe to be made out of.
To be clear, I think existing people have value too, so this isn’t an argument for blind successionism. It would also be dishonest not to admit that I am selfish to a significant degree (along with almost everyone else on Earth). What I have just described simply reflects my broad moral intuitions about what has value in our world from an impartial point of view, not a prescription that we should tile the universe with paperclips. Since humans and animals are currently the main preference-having beings in the world, at the moment I care most about fulfilling what they want the world to be like.
Note that unborn people are merely potential, as their existence depends on our choices. Future generations aren’t guaranteed—we decide whether or not they will exist, particularly those who might be born decades or centuries from now. This makes their moral status far less clear than someone who already exists or who is certain to exist at some point regardless of our choices.
Additionally, if we decide to account for the value of future beings, we might consider both potential human people and future AI entities capable of having moral value. From a utilitarian perspective, both human and AI welfare presumably matters. This makes the ethical calculus more complicated, as the dilemma isn’t merely about whether we risk losing all future generations, but rather whether we risk shifting posterity from humans to AIs.
Personally, I’m largely comfortable evaluating our actions primarily—though not entirely—based on their impact on current human lives, or at least people (and animals) who will exist in the near-term. I value our present generation. I want us to keep living and to thrive. It would be a tragedy if we either went extinct or died from aging. However, to the extent that I care about distant future generations, my concern is substrate-impartial, and I don’t particularly favor humans over AIs.
For me, I’m at ~10% P(doom). Whether I’d accept a proposed slowdown depends on how much I expect it to decrease this number.[2]
How do you model this situation? (also curious on your numbers)
I put the probability that AI will directly cause humanity to go extinct within the next 30 years at roughly 4%. By contrast, over the next 10,000 years, my p(doom) is substantially higher, as humanity could vanish for many different possible reasons, and forecasting that far ahead is almost impossible. I think a pause in AI development matters most for reducing the near-term, direct AI-specific risk, since the far-future threats are broader, more systemic, harder to influence, and only incidentally involve AI as a byproduct of the fact that AIs will be deeply embedded in our world.
I’m very skeptical that a one-year pause would meaningfully reduce this 4% risk. This skepticism arises partly because I doubt much productive safety research would actually happen during such a pause. In my view, effective safety research depends heavily on an active feedback loop between technological development and broader real-world applications and integration, and pausing the technology would essentially interrupt this feedback loop. This intuition is also informed by my personal assessment of the contributions LW-style theoretical research has made toward making existing AI systems safe—contributions which, as far as I can tell, have been almost negligible (though I’m not implying that all safety research is similarly ineffective or useless).
I’m also concerned about the type of governmental structures and centralization of power required to enforce such a pause. I think pausing AI would seriously risk creating a much less free and dynamic world. Even if we slightly reduce existential risks by establishing an international AI pause committee, we should still be concerned about the type of world we’re creating through such a course of action. Some AI pause proposals seem far too authoritarian or even totalitarian to me, providing another independent reason why I oppose pausing AI.
Additionally, I think that when AI is developed, it won’t merely accelerate life-extension technologies and save old people’s lives; it will likely also make our lives vastly richer and more interesting. I’m excited about that future, and I want the 8 billion humans alive today to have the opportunity to experience it. This consideration adds another important dimension beyond merely counting potential lives lost, again nudging me towards supporting acceleration.
Overall, the arguments in favor of pausing AI seem surprisingly weak to me, considering the huge potential upsides from AI development, my moral assessment of the costs and benefits, my low estimation of the direct risk from misaligned AI over the next 30 years, and my skepticism about how much pausing AI would genuinely reduce AI risks.
“But with AI risk, the stakes put most of us on the same side: we all benefit from a great future, and we all benefit from not being dead.”
I appreciate this thoughtful perspective, and I think it makes sense, in some respects, to say we’re all on the same “side”. Most people presumably want a good future and want to avoid catastrophe, even if we have different ideas on how to get there.
That said, as someone who falls on the accelerationist side of things, I’ve come to realize that my disagreements with others often come down to values and not just facts. For example, a common disagreement revolves around the question: How bad would it be if by slowing down AI, we delay life-saving medical technologies that otherwise would have saved our aging parents (along with billions of other people) from death? Our answer to this question isn’t just empirical: it also reflects our moral priorities. Even if we agreed on all the factual predictions, how we weigh this kind of moral loss would still greatly affect our policy views.
Another recurring question is how to evaluate the loss incurred by the risk of unaligned AI: how bad would it be exactly if AI was not aligned with humans? Would such an outcome just be a bad outcome for us, like how aging and disease are bad to the people who experience it, or would it represent a much deeper and more tragic loss of cosmic significance, comparable to the universe never being colonized?
For both of these questions, I tend to fall on the side that makes acceleration look like the more rational choice, which can help explain my self-identification in that direction.
So while factual disagreements do matter, I think it’s important to recognize that value differences can run just as deep. And those disagreements can unfortunately put us on fundamentally different sides, despite surface-level agreement on abstract goals like “not wanting everyone to die”.
Previous discussion of this post can be found here.
It seems everyone has this problem with your writing
This is o3’s take, for what it’s worth:
Yes — Garrett Baker repeatedly and materially misrepresents what Matthew is saying.
I have custom instructions turned off, and I haven’t turned on the memory feature, so there’s no strong reason to expect it to behave sycophantically (that I’m aware of). And o3 said it doesn’t know which side I’m on. I expect most other LLMs will say something similar when given neutral prompts and the full context.
(Not that this is strong evidence. But I think it undermines your claim by at least a bit.)
It seems everyone has this problem with your writing, have you considered speaking more clearly or perhaps considering people understand you fully and it is you who are wrong?
I reject the premise. In general, my writing is interpreted significantly more accurately when I’m not signaling skepticism about AI risk on LessWrong. For most other topics, including on this site, readers tend to understand my points reasonably well, especially when the subject is less controversial.
This could perhaps mean I’m uniquely unclear when discussing AI risk. It’s also very plausible that the topic itself is unusually prone to misrepresentation. Still, I think a major factor is that people are often uncharitable toward unpopular viewpoints they strongly disagree with, which accounts for much of the pushback I receive on this subject.
I do not believe analogizing the positions of those who disagree with you with luddites from the 19th century (in particular when thousands of pages of publicly available writings, with which you are familiar, exist) is the best way to invite those conversations.
To clarify, I am not analogizing the positions of those who disagree with me to Luddites from the 19th century. That was not my intention, nor was it my argument.
I think we’re talking past each other here, so I will respectfully drop this discussion.
Yes, I agree your whole comment sucks.
I wasn’t asking for your evaluation of the rest of my comment. I was clarifying a specific point because it seemed you had misunderstood what I was saying.
So we can get the observed shift with most of the “highly technical DL-specific considerations” mainly updating the p(AGI soon) factor via the incredibly complicated and arcane practice of… extrapolating benchmark scores.
Indeed, the fact AGI seems to be arriving so quickly is the main reason most people are worried!
If someone says their high p(doom) is driven by short timelines, what they likely mean is that AGI is now expected to arrive via a certain method—namely, deep learning—that is perceived as riskier than what might have emerged under slower or more deliberate development. If that’s the case, it directly supports my core point.
This explanation makes sense to me, since expecting AGI to arrive soon doesn’t by itself justify a high probability of doom. After all, it would always have been reasonable to believe that AGI would come eventually, and it would have been unjustified to increase one’s p(doom) over time merely because time was passing.
There can be additional reasons deep learning is bad in their book, but is deep learning a core part of their arguments? Hell no! Do you know how I know? I’ve actually read them!
I think you’re conflating two distinct issues: first, what initially made people worry about AI risk at all; and second, what made people think doom is likely as opposed to merely a possibility worth taking seriously. I’m addressing the second point, not the first.
Please try to engage with what I’m actually saying, rather than continuing to misrepresent my position.
He has been warning of a significant risk of catastrophe for a long time, but unless I’m mistaken, he only began explicitly and primarily arguing for a high probability of catastrophe more recently, around the time deep learning emerged. This distinction is essential to my argument, and was highlighted explicitly by my comment.
I don’t think the mainline doom arguments claim to be rooted in deep learning?
To assess this claim, we can examine the blurb in Nate and Eliezer’s new book announcement, which states:
If any company or group, anywhere on the planet, builds an artificial superintelligence using anything remotely like current techniques, based on anything remotely like the present understanding of AI, then everyone, everywhere on Earth, will die.
From this quote, I draw two main inferences. First, their primary concern seems to be driven by the nature of existing deep learning technologies. [ETA: To be clear, I mean that it’s the primary factor driving their high p(doom), not that they’d be unconcerned about AI risk without deep learning.] This is indicated by the phrase “anything remotely like current techniques”, which suggests that their core worries stem largely from deep learning rather than from all potential AI development pathways. Second, the statement conveys a high degree of confidence in their prediction. The claim is presented without any hedging or uncertainty—there are no phrases like “it’s possible that” or “we think this may occur.” The absence of such qualifiers implies that they see the outcome as highly probable, rather than speculative.
Now, imagine that, using only abstract reasoning available in the 19th century, someone could reasonably arrive at a 5% estimate for the likelihood that AI would pose an existential risk. Then suppose that, after observing the development and capabilities of modern deep learning, this estimate increases to 95%. In that case, I think it would be fair to say that the central or primary source of concern is rooted in the developments in deep learning, rather than in the original abstract arguments. That’s because the bulk of the concern emerged in response to concrete evidence from deep learning, and not from the earlier theoretical reasoning alone. I think this is broadly similar to MIRI’s position, although they may not go quite as far in attributing the shift in concern to deep learning alone.
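To spell out the arithmetic behind this hypothetical (using only the illustrative 5% and 95% figures above, which aren’t meant to be anyone’s actual credences), the shift looks even starker in odds form:

```latex
\text{prior odds} = \frac{0.05}{0.95} = \frac{1}{19}, \qquad
\text{posterior odds} = \frac{0.95}{0.05} = 19, \qquad
\text{implied Bayes factor} = \frac{19}{1/19} = 361
```

On these purely illustrative numbers, the evidence from deep learning contributes a roughly 361:1 update, while the earlier abstract reasoning supplies only the initial 1:19 odds, which is the sense in which I’d say the resulting concern is rooted mainly in deep learning rather than in the original arguments.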
Conversely, if someone already had a 95% credence in AI posing an existential threat based solely on abstract considerations from the 19th century—before the emergence of deep learning—then it would make more sense to say that their core concern is not based on deep learning at all. Their conviction would have been established independently of modern developments. This latter view is the one I was responding to in my original comment, as it seemed inconsistent with how others—including MIRI—have characterized the origin and basis of their concerns, as I’ve outlined above.
I am confused and feel like I must be misunderstanding your point. It feels like you’re attempting a “gotcha” argument, but I don’t understand your point or who you’re trying to criticize. It seems like bizarre rhetorical practice. It is not a valid argument to say that “people can hold position A for bad reason X, therefore all people who hold position A also hold it for bad reason X even if they claim it is for good reason Y”. But that seems to be your argument?
I think you’re overinterpreting my comment and attributing to me the least charitable plausible interpretation of what I wrote (as are most of the other people commenting and voting in this thread). As a general rule I’ve learned from my time in online communities, whenever someone makes a claim on a forum that indicates a rejection of a belief central to that forum’s philosophy, people tend to reply by ruthlessly assuming the most foolish plausible interpretation of their remarks. LessWrong is no exception.
My actual position is simply this: if the core arguments for AI doom could have genuinely been presented and anticipated in the 19th century, then the crucial factor that actually determines whether most “AI doomers” believe in AI doom is probably something relatively abstract or philosophical, rather than specific technical arguments grounded in the details of machine learning. This does not imply that technical arguments are irrelevant; it just means they’re probably not as cruxy to whether people actually believe that doom is probable or not.
(Also to be clear, unless otherwise indicated, in this thread I am using “belief in AI doom” as shorthand for “belief that AI doom is more likely than not” rather than “belief that AI doom is possible and at least a little bit plausible, so therefore worth worrying about.” I think these two views should generally be distinguished.)
(To clarify, I strong disagree voted, I haven’t downvoted at all—I still strongly disagree)
Oops, I recognize that, I just misstated it in my original comment.
You strong disagree downvoted my comment, but it’s still not clear to me that you actually disagree with my core claim. I’m not making a claim about priors, or whether it’s reasonable to think that p(doom) might be non-negligible a priori.
My point is instead about whether the specific technical details of deep learning today are ultimately what’s driving some people’s high probability estimates of AI doom. If the intuition behind these high estimates could’ve been provided in the 19th century (without modern ML insights), then modern technical arguments don’t seem to be the real crux.
Therefore, while you might be correct about priors regarding p(doom), or whether existing evidence reinforces high concern for AI doom, these points seem separate from my core claim about the primary motivating intuitions behind a strong belief in AI doom.
If we accept your interpretation—that AI doom is simply the commonsense view—then doesn’t that actually reinforce my point? It suggests that the central concern driving AI doomerism isn’t a set of specific technical arguments grounded in the details of deep learning. Instead, it’s based on broader and more fundamental intuitions about the nature of artificial life and its potential risks. To borrow your analogy: the belief that a brick falling on someone’s head would cause them harm isn’t ultimately rooted in technical disputes within Newtonian mechanics. It’s based on ordinary, everyday experience. Likewise, our conversations about AI doom should focus on the intuitive, commonsense cruxes behind it, rather than pretending that the real disagreement comes from highly specific technical deep learning arguments. Instead of undermining my comment, I think your point actually strengthens it.
“Maybe a thing smarter than humans will eventually displace us” is really not a very complicated argument, and no one is claiming it is. So it should be part of our hypothesis class, and various people like Turing thought of it well before modern ML.
This is a claim about what is possible, but I am talking about what people claim is probable. If the core idea of “AI doomerism” is that AI doom is merely possible, then I agree: little evidence is required to believe the claim. In this case, it would be correct to say that someone from the 19th century could indeed have anticipated the arguments for AI doom being possible, as such a claim would be modest and hard to argue against.
Yet a critical component of modern AI doomerism is not merely about what’s possible, but what is likely to occur: many people explicitly assert that AI doom is probable, not merely possible. My point is that if the core reasons supporting this stronger claim could have been anticipated in the 19th century, then it is a mistake to think that the key cruxes generating disagreement about AI doom hinge on technical arguments specific to contemporary deep learning.
It would be helpful if you could clearly specify the “basic argumentation mistakes” you see in the original article. The parent comment mentioned two main points: (1) the claim that I’m being misleading by listing costs of an LVT without comparing them to benefits, and (2) the claim that an LVT would likely replace existing taxes rather than add to them.
If I’m wrong on point (2), that would likely stem from complex empirical issues, not from a basic argumentation mistake. So I’ll focus on point (1) here.
Regarding (1), my article explicitly stated that its purpose was not to offer a balanced evaluation, but to highlight potential costs of an LVT that I believe are often overlooked or poorly understood. This is unlike the analogy given, where someone reviews a car only by noting its price while ignoring its features: in the case of a car review, the price is already transparent and easily verifiable, whereas with an LVT, the costs are often unclear, downplayed, or poorly communicated in public discourse, at least in my experience on Twitter.
By pointing out these underdiscussed costs, I’m aiming to provide readers with information they may not have encountered, helping them make a more informed overall judgment. Moreover, I explicitly and prominently linked to a positive case for an LVT and encouraged readers to compare both perspectives to reach a final conclusion.
A better analogy would be a Reddit post warning that although a car is advertised at $30,000, the true cost is closer to $60,000 after hidden fees are included. That post would still add value, even if it doesn’t review the car in full, because it would provide readers valuable information they might not be familiar with. Likewise, my article aimed to contribute by highlighting costs of an LVT that might otherwise go unnoticed.