Someone who is interested in learning and doing good.
My Twitter: https://twitter.com/MatthewJBar
My Substack: https://matthewbarnett.substack.com/
chapter 17 surprised me for how well it anticipated modern AI doomerism
It’s perhaps worth highlighting the significant tension between two contrasting claims: on the one hand, the idea that modern AI doomerism was “anticipated” as early as the 19th century, and on the other, the idea that modern AI doom arguments are rationally grounded in a technical understanding of today’s deep learning systems. If the core concerns about AI doom were truly foreseen over a century ago, long before any of the technical details of modern machine learning existed, then I suggest the arguments can’t really be based on those technical details in any deep or meaningful way.
One way to resolve this contradiction is to posit that AI doom arguments are not fundamentally about technical aspects at all, but are instead rooted in a broader philosophical stance—namely, that artificial life is by default bad, dangerous, or disvaluable (for example by virtue of lacking consciousness, or by virtue of being cold and calculating), while biological life is by default good or preferable. However, when framed in this way, the arguments lose much of their perceived depth and rigor, and look more like raw intuition-backed reactions to the idea of mechanical minds than tenable theories.
An important difference between the analogy you gave and our real situation is that non-Americans actually exist right now, whereas future human generations do not yet exist and may never come into existence; they are merely potential, and their existence depends on the choices we make today. A closer analogy would be choosing an 80% chance of making all humans immortal and a 20% chance of eliminating the possibility of future space colonization. Framed this way, I don't think taking such a gamble should be considered selfish or even short-sighted, though I understand that many people would still not want to take it.
It’s important to be precise about the specific claim we’re discussing here.
The claim that R&D is less valuable than broad automation is not equivalent to the claim that technological progress itself is less important than other forms of value. This is because technological progress is sustained not just by explicit R&D but also by large-scale economic forces that complement the R&D process, such as general infrastructure, tools, and complementary labor used to support the invention, implementation, and deployment of various technologies. These complementary factors make it possible both to run the experiments that enable new technologies to be developed and to diffuse those technologies widely once they leave the laboratory; both contributions are straightforwardly valuable.
To provide a specific operationalization of our thesis, we can examine the elasticity of economic output with respect to different inputs, that is, how much output increases when a particular input is scaled up. The thesis is that automating R&D on its own would raise output by significantly less than automating labor broadly (separately from R&D). This is effectively what we mean when we say R&D has "less value" than broad automation.
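To make that a bit more concrete, here is a rough sketch of the kind of accounting I have in mind; the Cobb-Douglas functional form below is my own simplifying assumption for illustration, not something taken from the report:

$$Y = A \, R^{\alpha} L^{\beta}$$

Here $R$ denotes R&D inputs, $L$ denotes broad (non-R&D) labor and capital, and $\alpha$ and $\beta$ are the elasticities of output with respect to each input: scaling $R$ up by 1% raises output by roughly $\alpha$%, and scaling $L$ up by 1% raises it by roughly $\beta$%. Our thesis then amounts to the claim that $\beta$ is substantially larger than $\alpha$, so automating broad labor moves output by more than automating R&D does.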
Does he not believe in AGI and Superintelligence at all? Why not just say that?
As one of the authors, I'll answer for myself. Unfortunately, I'm not sure precisely what these terms mean, so I'll answer a different question instead. If your question is whether I believe that AIs will eventually match or surpass human performance, either collectively or individually, across the full range of tasks that humans are capable of performing, then my answer is yes. I do believe that, in the long run, AI systems will reach or exceed human-level performance across virtually all domains of ability.
However, I fail to see how this belief directly supports the argument you are making in your comment. Even if we accept that AIs will eventually be highly competent across essentially all important tasks, that fact alone does not straightforwardly imply that our core thesis—that the main value from AI will come from broad automation rather than the automation of R&D—is incorrect.
It’s worth noting that I updated towards shorter timelines a few years ago. I don’t know exactly what you’re referring to when you talk about a “disgust reaction towards claims of short timelines [rather] than open curious engagement” (and I predictably disagree with your assessment) but I’d be open to seeing examples that could help demonstrate this claim.
I was pushing back against the ambiguous use of the word “they”. That’s all.
ETA: I edited the original comment to be more clear.
“They” is referring to Epoch as an entity, which the comment referenced directly. My guess is you just missed that?
I didn’t miss it. My point is that Epoch has a variety of different employees and internal views.
They have definitely described themselves as safety focused to me and others.
The original comment referenced (in addition to Epoch), “Matthew/Tamay/Ege”, yet you quoted Jaime to back up this claim. I think it’s important to distinguish who has said what when talking about what “they” have said. I for one have been openly critical of LW arguments for AI doom for quite a while now.
[I edited this comment to be clearer]
But anyway, it sometimes seems to me that you often advocate a morality regarding AI relations that doesn’t benefit anyone who currently exists, or, the coalition that you are a part of. This seems like a mistake. Or worse.
I dispute this, since I've argued for the practical benefits of giving AIs legal autonomy, which I think would likely benefit existing humans. Relatedly, I've also talked about how I think hastening the arrival of AI could benefit people who currently exist. Indeed, that's one of the best arguments for accelerating AI: by ensuring AI arrives sooner, we can accelerate the pace of medical progress, among other useful technologies. This could ensure that currently existing old people who would otherwise die without AI are instead saved, living longer and healthier lives than they would in the alternative.
(Of course, this must be weighed against concerns about AI safety. I am not claiming that there is no tradeoff between AI safety and acceleration. Rather, my point is that, despite the risks, accelerating AI could still be the preferable choice.)
However, I do think there is an important distinction here to make between the following groups:
1. The set of all existing humans
2. The human species itself, including all potential genetic descendants of existing humans
Insofar as I have loyalty towards a group, I have much more loyalty towards (1) than (2). It’s possible you think that I should see myself as belonging to the coalition comprised of (2) rather than (1), but I don’t see a strong argument for that position.
To the extent it makes sense to think of morality as arising from game theoretic considerations, there doesn't appear to be much advantage for me in identifying with the coalition of all potential human descendants (group 2) rather than with the coalition of currently existing humans plus potential future AIs (group 1 + AIs). If we are willing to extend our coalition to include potential future beings, then I would seem to have even stronger practical reasons to align myself with a coalition that includes future AI systems, since future AIs will likely be far more powerful than any potential biological human descendants.
I want to clarify, however, that I don’t tend to think of morality as arising from game theoretic considerations. Rather, I mostly think of morality as simply an expression of my personal preferences about the world.
Are you suggesting that I should base my morality on whether I’ll be rewarded for adhering to it? That just sounds like selfishness disguised as impersonal ethics.
To be clear, I do have some selfish/non-impartial preferences. I care about my own life and happiness, and the happiness of my friends and family. But I also have some altruistic preferences, and my commentary on AI tends to reflect that.
I'm not completely sure, since I was not personally involved in the relevant negotiations for FrontierMath. However, what I can say is that Tamay has already indicated that Epoch should have tried harder to obtain contract terms that would have allowed us greater transparency. I don't think it makes sense for him to say that unless he believes a different outcome was feasible.
Also, I want to clarify that this new benchmark is separate from FrontierMath and we are under different constraints with regards to it.
I can’t make any confident claims or promises right now, but my best guess is that we will make sure this new benchmark stays entirely private and under Epoch’s control, to the extent this is feasible for us. However, I want to emphasize that by saying this, I’m not making a public commitment on behalf of Epoch.
Having hopefully learned from our mistakes regarding FrontierMath, we intend to be more transparent to collaborators for this new benchmark. However, at this stage of development, the benchmark has not reached a point where any major public disclosures are necessary.
I suppose that means it might be worth writing an additional post that more directly responds to the idea that AGI will end material scarcity. I agree that thesis deserves a specific refutation.
This seems less like a normal friendship and more like a superstimulus simulating the appearance of a friendship for entertainment value. It seems reasonable enough to characterize it as non-authentic.
I assume some people will end up wanting to interact with a mere superstimulus; however, other people will value authenticity and variety in their friendships and social experiences. This comes down to human preferences, which will shape the type of AIs we end up training.
The conclusion that nearly all AI-human friendships will seem inauthentic thus seems unwarranted. Unless the superstimulus is irresistible, it won't be the only type of relationship people have.
Since most people already express distaste at non-authentic friendships with AIs, I assume there will be a lot of demand for AI companies to train higher quality AIs that are not superficial and pliable in the way you suggest. These AIs would not merely appear independent but would literally be independent in the same functional sense that humans are, if indeed that’s what consumers demand.
This can be compared to addictive drugs and video games, which are popular, but not universally viewed as worthwhile pursuits. In fact, many people purposely avoid trying certain drugs to avoid getting addicted: they’d rather try to enjoy what they see as richer and more meaningful experiences from life instead.
They might be about getting unconditional love from someone or they might be about having everyone cowering in fear, but they’re pretty consistently about wanting something from other humans (or wanting to prove something to other humans, or wanting other humans to have certain feelings or emotions, etc)
I agree with this view; however, I am not sure it rescues the position that a human who succeeds in taking over the world would not pursue actions that are extinction-level bad.
If such a person has absolute power in the way assumed here, their strategies to get what they want would not be limited to nice and cooperative strategies with the rest of the world. As you point out, an alternative strategy could be to cause everyone else to cower in fear or submission, which is indeed a common strategy for dictators.
and my guess is that getting simulations of those same things from AI wouldn’t satisfy those desires.
My prediction is that people will find AIs just as satisfying to have as peers as they find humans. In fact, I'd go further: for almost any axis you can name, you could train an AI that is superior to humans along that axis, and that would make for a more interesting and more compelling peer.
I think you are downplaying AI by calling what it offers a mere "simulation": there's nothing inherently less real about a mind made of silicon than about a mind made of flesh. AIs can be funnier, more attractive, more adventurous, harder working, more social, friendlier, more courageous, and smarter than humans, and all of these traits serve as sufficient motives for an uncaring dictator to replace their human peers with AIs.
But we certainly have evidence about what humans want and strive to achieve, eg Maslow’s hierarchy and other taxonomies of human desire. My sense, although I can’t point to specific evidence offhand, is that once their physical needs are met, humans are reliably largely motivated by wanting other humans to feel and behave in certain ways toward them.
I think the idea that most people’s “basic needs” can ever be definitively “met”, after which they transition to altruistic pursuits, is more or less a myth. In reality, in modern, wealthy countries where people have more than enough to meet their physical needs—like sufficient calories to sustain themselves—most people still strive for far more material wealth than necessary to satisfy their basic needs, and they do not often share much of their wealth with strangers.
(To clarify: I understand that you may not have meant that humans are altruistic, just that they want others to “feel and behave in certain ways toward them”. But if this desire is a purely selfish one, then I would be very fearful of how it would be satisfied by a human with absolute power.)
The notion that there’s a line marking the point at which human needs are fully met oversimplifies the situation. Instead, what we observe is a constantly shifting and rising standard of what is considered “basic” or essential. For example, 200 years ago, it would have been laughable to describe air conditioning in a hot climate as a basic necessity; today, this view is standard. Similarly, someone like Jeff Bezos (though he might not say it out loud) might see having staff clean his mansion as a “basic need”, whereas the vast majority of people who are much poorer than him would view this expense as frivolous.
One common model to make sense of this behavior is that humans get logarithmic utility in wealth. In this model, extra resources have sharply diminishing returns to utility, but humans are nonetheless insatiable: the derivative of utility with respect to wealth is always positive, at every level of wealth.
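To spell that model out explicitly (this is just the standard textbook formulation, nothing specific to this discussion):

$$u(w) = \log w, \qquad u'(w) = \frac{1}{w} > 0, \qquad u''(w) = -\frac{1}{w^{2}} < 0$$

Marginal utility $u'(w)$ shrinks quickly as wealth $w$ grows, which captures the diminishing returns, but it never hits zero, which captures the insatiability: under this model, doubling your wealth adds the same fixed increment of utility (namely $\log 2$) whether you start from ten thousand dollars or ten billion.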
Now, of course, it’s clear that many humans are also altruistic to some degree, but:
- People who would be likely to try to take over the world are, I expect, more like brutal dictators than like the median person. This makes me much more worried about what a human would do if they tried and succeeded in taking over the world.
- Common apparent examples of altruism are often easily explained as mere costless signaling, i.e. cheap talk, rather than genuine altruism. Actively sacrificing one's material well-being for the sake of others is much less common than merely saying that you care about others, which can be explained by the fact that saying you care costs nothing selfishly. Likewise, voting for a candidate who promises to help other people is not significant evidence of altruism, since it costs an individual almost nothing to vote for such a politician.
Humanity is a cooperative species, but not necessarily an altruistic one.
There appears to be a motte-and-bailey worth unpacking. The weaker, easily defensible claim is that advanced AI could be risky or dangerous. This modest assertion requires little evidence, similar to claims that extraterrestrial aliens, advanced genetic engineering of humans, or large-scale human cloning might be dangerous. I do not dispute this modest claim.
The stronger claim about AI doom is that doom is likely rather than merely possible. This substantial claim demands much stronger evidence than the weaker claim. The tension I previously raised addresses this stronger claim of probable AI doom (“AI doomerism”), not the weaker claim that advanced AI might be risky.
Many advocates of the strong claim of AI doom explicitly assert that their belief is backed by technical arguments, such as the counting argument for scheming behavior in SGD, among other arguments. However, if the premise of AI doom does not, in fact, rely on such technical arguments, then it is a mistake to argue about these ideas as if they are the key cruxes generating disagreement about AI doom.