My blog is here. You can subscribe for new posts there.
My personal site is here.
My X/Twitter is here.
You can contact me using this form.
To be somewhat more fair, the worry here is this: in a regime where you no longer need society because AIs can do all the work, value conflicts become a bigger deal than they are today, because there is less reason to tolerate other people's values if you can just found your own society based on your own values. And if you believe in the vulnerable world hypothesis, as a lot of rationalists do, then conflict has existential stakes (and even if not, it can be quite bad), so one group controlling the future is better than inevitable conflict.
So to summarise: if we have a multipolar world and the vulnerable world hypothesis is true, then conflict can be existentially bad, and this is a reason to avoid a multipolar world. I hadn't considered this; interesting point!
At a foundational level, whether or not our current tolerance for differing values is stable ultimately comes down to whether we can compensate for the effect of AGI allowing people to make their own societies.
Considerations:
offense/defense balance (if offense wins very hard, it’s harder to let everyone do their own thing)
tunability-of-AGI-power / implementability of the harm principle (if you can give everyone an AGI that reliably follows the rule “don’t let these people harm other people”, then you can safely give that AGI to everyone, and they can build planets however they like but not death-ray anyone else’s planets)
The latter might be more of a “singleton that allows playgrounds” rather than an actual multipolar world, though.
Some of my general worries with singleton worlds are:
humanity has all its eggs in one basket—you better hope the governance structure is never corrupted, or never becomes sclerotic; real-life institutions so far have not given me many signs of hope on this count
cultural evolution is a pretty big part of how human societies seem to have improved and relies on a population of cultures / polities
vague instincts towards diversity being good and less fragile than homogeneity or centralisation
Comment is also on substack:
Thanks!
This post seems to misunderstand what it is responding to
fwiw, I see this post less as “responding” to something, and more laying out considerations on their own with some contrasting takes as a foil.
(On Substack, the title is “Capital, AGI, and human ambition”, which is perhaps better)
that material needs will likely be met (and selfish non-positional preferences mostly satisfied) due to extreme abundance (if humans retain control).
I agree with this, though I’d add: “if humans retain control” and some sufficient combination of culture/economics/politics/incentives continues opposing arbitrary despotism.
I also think that even if all material needs are met, avoiding social stasis and lock-in matters.
Scope sensitive preferences
Scope sensitivity of preferences is a key concept that matters here, thanks for pointing that out.
Various other considerations about types of preferences / things you can care about (presented without endorsement):
instrumental preference to avoid stasis because of a belief it leads to other bad things (e.g. stagnant intellectual / moral / political / cultural progress, increasing autocracy)
altruistic preferences combined with a fear that less altruism will result if today’s wealth hierarchy is locked in, than if social progress and disruption continued
a belief that it’s culturally good when human competition has some anchoring to object-level physical reality (c.f. the links here)
a general belief in a tendency for things to go off the rails without a ground-truth unbeatable feedback signal that the higher-level process needs to be wary of—see Gwern’s Evolution as a backstop for RL
preferences that become more scope-sensitive due to transhumanist cognitive enhancement
positional preferences, i.e. wanting to be higher-status or more something than some other human(s)
a meta-positional-preference that positions are not locked in, because competition is fun
a preference for future generations having at least as much of a chance to shape the world, themselves, and their position as the current generation
an aesthetic preference for a world where hard work is rewarded, or rags-to-riches stories are possible
However, note that if these preferences are altruistic and likely to be the kind of thing other people might be sympathetic to, personal savings are IMO likely to be not-that-important relative to other actions.
I agree with this on an individual level. (On an org level, I think philanthropic foundations might want to consider my arguments above for money buying more results soon, but this needs to be balanced against higher leverage on AI futures sooner rather than later.)
Further, I do actually think that the default outcome is that existing governments at least initially retain control over most resources such that capital isn’t clearly that important, but I won’t argue for this here (and the post does directly argue against this).
Where do I directly argue against that? A big chunk of this post is pointing out how the shifting relative importance of capital v labour changes the incentives of states. By default, I expect states to remain the most important and powerful institutions, but the frame here is very much human v non-human inputs to power and what that means for humans, without any particular stance on how the non-human inputs are organised. I don’t think states v companies v whatever fundamentally changes the dynamic: with labour-replacing AI, power flows from data centres, other physical capital, and whoever has the financial capital to pay for it, and it sidesteps humans doing work. That is the shift I care about.
(However, I think which institutions do the bulk of decision-making re AI does matter for a lot of other reasons, and I’d be very curious to get your takes on that)
My guess is that the most fundamental disagreement here is about how much power tries to get away with when it can. My read of history leans towards: things are good for people when power is correlated with things being good for people, and otherwise not (though I think material abundance is very important too and always helps a lot). I am very skeptical of the stability of good worlds where incentives and selection pressures do not point towards human welfare.
For example, assuming a multipolar world where power flows from AI, the equilibrium is putting all your resources on AI competition and none on human welfare. I don’t think it’s anywhere near certain we actually reach that equilibrium, since sustained cooperation is possible (c.f. Ostrom’s Governing the Commons), and since a fairly trivial fraction of the post-AGI economy’s resources might suffice for human prosperity (and since maybe we in fact do get a singleton—but I’d have other issues with that). But this sort of concern still seems neglected and important to me.
Thanks for this link! That’s a great post
If you have [a totalising worldview] too, then it’s a good exercise to put it into words. What are your most important Litanies? What are your noble truths?
The Straussian reading of Yudkowsky is that this does not work. Even if your whole schtick is being the arch-rationalist, you don’t get people on board by writing out 500 words explicitly summarising your worldview. Even when you have an explicit set of principles, it needs to have examples and quotes to make it concrete (note how many people Yudkowsky quotes and how many examples he gives in the 12 virtues piece), and be surrounded by other stuff that (1) brings down the raw cognitive inferential distance, and (2) gives it life through its symbols / Harry-defeating-the-dementor stories / examples of success / cathedrals / thumos.
It is possible that writing down the explicit summary can be actively bad for developing it, especially if it’s vague / fuzzy / early-stages / not-fully-formed. Ideas need time to gestate, and an explicit verbal form is not always the most supportive container.
Every major author who has influenced me has “his own totalising and self-consistent worldview/philosophy”. This list includes Paul Graham, Isaac Asimov, Joel Spolsky, Brett McKay, Shakyamuni, Chuck Palahniuk, Bryan Caplan, qntm, and, of course, Eliezer Yudkowsky, among many others.
Maybe this is not the distinction you’re focused on, but to me there’s a difference between thinkers who have a worldview/philosophy, and ones who have a totalising one that’s an entire system of the world.
Of your list, I only know of Graham, Asimov, Caplan, and, of course, Yudkowsky. All of them have a worldview, yes, and Caplan is maybe a bit of the way towards a “system of the world” because he does seem to have an overall coherent perspective on economics, politics, education, and culture (though perhaps not very differentiated from other libertarian economists?).
Paul Graham definitely gets a lot of points for being right about many startup things before others and contrarian in the early days of Y Combinator, but he seems to me mainly an essayist with domain-specific correct takes about startups, talent, aesthetics, and Lisp rather than someone out to build a totalising philosophy of the world.
My impression of Asimov is that he was mainly a distiller and extrapolator of mid-century modernist visions of progress and science. To me, authors like Vernor Vinge are far more prophetic, Greg Egan is far more technically deep, Heinlein was more culturally and politically rich, Clarke was more diverse, and Neal Stephenson just feels smarter while being almost equally trend-setting as Asimov.
I’d be curious to hear if you see something deeper or more totalising in these people?
I copy-pasted markdown from the dev version of my own site, and the images showed up fine on my computer because I was running the dev server; images now fixed to point to the Substack CDN copies that the Substack version uses. Sorry for that.
Image issues now fixed, apologies for that.
Thanks for the review! Curious what you think the specific fnords are—the fact that it’s very space-y?
What do you expect the factories to look like? I think an underlying assumption in this story is that tech progress came to a stop on this world (presumably otherwise it would be way weirder, and eventually spread to space).
I was referring to McNamara’s government work, forgot about his corporate job before then. I agree there’s some SpaceX to (even pre-McDonnell Douglas merger?) Boeing axis that feels useful, but I’m not sure what to call it or what you’d do to a field (like US defence) to perpetuate the SpaceX end of it, especially over events like handovers from Kelly Johnson to the next generation.
That most developed countries, and therefore most liberal democracies, are getting significantly worse over time at building physical things seems like a Big Problem (see e.g. here). I’m glad this topic got attention on LessWrong through this post.
The main criticism I expect could be levelled at this post is that it’s very non-theoretical. It doesn’t attempt a synthesis of the lessons or takeaways. Many quotes are presented but not analysed.
(To take one random thing that occurred to me: the last quote from Anduril puts significant blame on McNamara. From my reading of The Wizards of Armageddon, McNamara seems like a typical brilliant twentieth century hard-charging modernist technocrat. Now, he made lots of mistakes, especially in the direction of being too quantitative / simplistic in the sorts of ways that Seeing Like a State dunks on. But say the rule you follow is “appoint some hard-charging brilliant technocrat and give them lots of power”; all of McNamara, Kelly Johnson, and Leslie Groves might seem very good by this light, even though McNamara’s (claimed) effect was to destroy the Groves/Johnson type of competence in US defence. How do you pick the Johnsons and Groveses over the McNamaras? What’s the difference between the culture that appoints McNamaras and one that appoints Groveses and Johnsons? More respect for hands-on engineering? Less politics, more brute need for competence and speed due to a war? Is McNamara even the correct person to blame here? Is the type of role that McNamara was in just fundamentally different from the Groves and Johnson roles such that the rules for who does well in the latter don’t apply to the former?)
(I was also concerned about the highly-upvoted critical comment, though it seems like Jacob did address the factual mistakes pointed out there.)
However, I think the post is very good and is in fact better off as a bunch of empirical anecdotes than attempting a general theory. Many things are best learnt by just being thrown a set of case studies. Clearly, something was being done at Skunk Works that the non-SpaceX American defence industry currently does not do. Differences like this are often hard-to-articulate intangible cultural stuff, and just being temporarily immersed in stories from the effective culture is often at least as good as an abstract description of what the differences were. I also appreciated the level of empiricism where Jacob was willing to drill down to actual primary sources like the rediscovered Empire State Building logbook.
This post rings true to me because it points in the same direction as many other things I’ve read on how you cultivate ideas. I’d like more people to internalise this perspective, since I suspect that one of the bad trends in the developed world is that it keeps getting easier and easier to follow incentive gradients, get sucked into an existing memeplex that stops you from thinking your own thoughts, and minimise the risks you’re exposed to. To fight back against this, ambitious people need to have in their heads some view of how uncomfortable chasing of vague ideas without immediate reward can be the best thing you can do, as a counter-narrative to the temptation of more legible opportunities.
In addition to Paul Graham’s essay that this post quotes, some good companion pieces include Ruxandra Teslo on the scarcity and importance of intellectual courage (emphasising the courage requirement), this essay (emphasising motivation and persistence), and this essay from Dan Wang (emphasising the social pulls away from the more creative paths).
It’s striking that there are so few concrete fictional descriptions of realistic AI catastrophe, despite the large amount of fiction in the LessWrong canon. The few exceptions, like Gwern’s here or Gabe’s here, are about fast take-offs and direct takeover.
I think this is a shame. The concreteness and specificity of fiction make it great for imagining futures, and its emotional pull can help us make sense of the very strange world we seem to be heading towards. And slower catastrophes, like Christiano’s What failure looks like, are a large fraction of a lot of people’s p(doom), despite being less cinematic.
One thing that motivated me in writing this was that Bostrom’s phrase “a Disneyland without children” seemed incredibly poetic. On first glance it’s hard to tell a compelling or concrete story about gradual goodharting: “and lo, many actors continued to be compelled by local incentives towards collective loss of control …”—zzzzz … But imagine a technological and economic wonderland rising, but gradually disfiguring itself as it does so, until you have an edifice of limitless but perverted plenty standing crystalline against the backdrop of a grey dead world—now that is a poetic tragedy. And that’s what I tried to put on paper here.
Did it work? Unclear. On the literary level, I’ve had people tell me they liked it a lot. I’m decently happy with it, though I think I should’ve cut it down in length a bit more.
On the worldbuilding, I appreciated being questioned on the economic mechanics in the comments, and I think my exploration of this in the comments is a decent stab at what I think is a neglected set of questions about how much the current economy being fundamentally grounded in humans limits the scope of economic-goodharting catastrophes. Recently, I discovered earlier exploration of very similar questions in Scott Alexander’s 2016 “Ascended economy?”, and by Andrew Critch here. I also greatly appreciated Andrew Critch’s recent (2024) post raising very similar concerns about “extinction by industrial dehumanization”.
I continue to hope that more people work on this, and that this piece can help by concretising this class of risks in people’s minds (I think it is very hard to get people to grok a future scenario and care about it unless there is some evocative description of it!).
I’d also hope there was some way to distribute this story more broadly than just on LessWrong and my personal blog. Ted Chiang and the Arrival movie got lots of people exposed to the principle of least action—no small feat. It’s time for the perception of AI risk to break out of decades of Terminator comparisons, and move towards a basket of good fictional examples that memorably demonstrate subtle concepts.
Really like the song! Best AI generation I’ve heard so far. Though I might be biased since I’m a fan of Kipling’s poetry: I coincidentally just memorised the source poem for this a few weeks ago, and also recently named my blog after a phrase from Hymn of Breaking Strain (which was already nicely put to non-AI music as part of Secular Solstice).
I noticed you had added a few stanzas of your own:
As the Permian Era ended, we were promised a Righteous Cause,
To fight against Oppression or take back what once was ours.
But all the Support for our Troops didn’t stop us from losing the war
And the Gods of the Copybook Headings said “Be careful what you wish for.”

In Scriptures old and new, we were promised the Good and the True
By heeding the Authorities and shunning the outcast few
But our bogeys and solutions were as real as goblins and elves
And the Gods of the Copybook Headings said “Learn to think for yourselves.”
Kipling’s version has a particular slant to which vices it disapproves of, so I appreciate the expansion. The second stanza is great IMO, but the first stanza sounds a bit awkward in places. I had some fun messing with it:
As the Permian Era ended, we were promised the Righteous Cause.
In the fight against Oppression, we could ignore our cherished Laws,
Till righteous rage and fury made all rational thought uncouth.
And the Gods of the Copybook Headings said “The crowd is not the truth”
The AI time estimates are wildly high IMO, across basically every category. Some parts are also clearly optional (e.g. spending 2 hours reviewing). If you know what you want to research, writing a statement can take much less time. I have previously applied to ML PhDs in two weeks and gotten an offer. The recommendation letters have the longest lead time and are the most awkward to request at such notice, but two weeks isn’t obviously insane, especially if you have a good relationship with your reference letter writers (many students do things later than is recommended; no reference letter writer in academia will be shocked by this).
If you apply in December 2025, you would start in fall 2026. That is a very, very long time from now. I think the stupidly long application cycle is pure dysfunction on academia’s part, but you still need to take it into account.
(Also fyi, some UK programs have deadlines in spring if you can get your own funding)
You have restored my faith in LessWrong! I was getting worried that despite 200+ karma and 20+ comments, no one had actually nitpicked the descriptions of what actually happens.
The zaps of light are diffraction limited.
In practice, if you want the atmospheric nanobots to zap stuff, you’ll need to do some complicated mirroring because you need to divert sunlight. And it’s not one contiguous mirror but lots of small ones. But I think we can still model this as basic diffraction with some circular mirror / lens.
Intensity $I = \eta \frac{P}{\pi r^2}$, where $P$ is the total power of sunlight falling on the mirror disk, $r$ is the radius of the Airy disk, and $\eta$ is an efficiency constant I’ve thrown in (because of things like atmospheric absorption (Claude says, somewhat surprisingly, this shouldn’t be ridiculously large), and not all the energy in the diffraction pattern being in the Airy disk (about 84% is, says Claude), etc.)
Now, $P = \pi (D/2)^2 S$, where $D$ is the diameter of the mirror configuration and $S$ is the solar irradiance. And $r \approx L \cdot \frac{1.22\lambda}{D}$, where $L$ is the focal length (distance from mirror to target), and $\frac{1.22\lambda}{D}$ the angular size of the central spot.
So we have $I = \frac{\eta S D^4}{4 (1.22 \lambda L)^2}$, so the required mirror configuration diameter is $D = \left(\frac{4 I (1.22 \lambda L)^2}{\eta S}\right)^{1/4}$.
Plugging in some reasonable values for $\lambda$ (average incoming sunlight—yes the concentration suffers a bit because it’s not all this wavelength), $I$ (the intensity of an industrial laser that can cut metal), $L$ (lower stratosphere), and $S$ (solar irradiance), together with a conservative guess that 99% of power is wasted so $\eta = 0.01$, we get a mirror configuration a few dozen metres across (and the resulting beam is about 3mm wide).
So a few dozen metres of upper atmosphere nanobots should actually give you a pretty ridiculous concentration of power!
(I did not know this when I wrote the story; I am quite surprised the required radius is this ridiculously tiny. But I had heard of the concept of a “weather machine” like this from the book Where is my flying car?, which I’ve reviewed here, which suggests that this is possible.)
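For anyone who wants to poke at the numbers themselves, here is a minimal sketch of the calculation in Python. The specific values (wavelength, target intensity, altitude, irradiance, efficiency) are illustrative assumptions of mine, not necessarily the exact inputs used above:

```python
# Sketch of the diffraction-limited "weather machine" zap calculation.
# All numbers below are illustrative assumptions, not the post's exact inputs.

wavelength = 500e-9       # m; roughly the peak of the solar spectrum
target_intensity = 1e10   # W/m^2; ballpark for a metal-cutting laser focus
focal_length = 15e3       # m; roughly lower-stratosphere altitude
solar_irradiance = 1361   # W/m^2; solar constant at the top of the atmosphere
efficiency = 0.01         # conservative guess: 99% of the power is wasted

# From I = eta * S * D^4 / (4 * (1.22 * lambda * L)^2), solve for the
# mirror-configuration diameter D.
diffraction_term = 1.22 * wavelength * focal_length
diameter = (4 * target_intensity * diffraction_term**2
            / (efficiency * solar_irradiance)) ** 0.25

# Airy-disk radius at the target, i.e. the beam spot size.
spot_radius = diffraction_term / diameter

print(f"Required mirror configuration diameter: {diameter:.1f} m")
print(f"Beam (Airy disk) width at the target: {2 * spot_radius * 1e3:.2f} mm")
```

With these assumptions the diameter comes out to a couple dozen metres and the spot to under a millimetre; the exact figures swing with the assumed intensity and altitude, but the conclusion that a few dozen metres of mirrors suffices looks fairly robust.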
Partly because it’s hard to tell between an actual animal and a bunch of nanobots pretending to be an animal. So you can’t zap the nanobots on the ground without making the ground uninhabitable for humans.
I don’t really buy this, why is it obvious the nanobots could pretend to be an animal so well that it’s indistinguishable? Or why would targeted zaps have bad side-effects?
The “California red tape” thing implies some alignment strategy that stuck the AI to obey the law, and didn’t go too insanely wrong despite a superintelligence looking for loopholes
Yeah, successful alignment to legal compliance was established without any real justification halfway through. (How to do this is currently an open technical problem, which, alas, I did not manage to solve for my satirical short story.)
Convince humans that Dyson spheres are pretty and don’t block the view?
This is a good point, especially since a high level of emotional manipulation was an established in-universe AI capability. (The issue described with the Dyson sphere was less that it itself would block the view, and more that building it would require dismantling the planets in a way that ruins the view—though now I’m realising that “if the sun on Earth is blocked, all Earthly views are gone” is a simpler reason and removes the need for building anything on the other planets at all.)
There is also no clear explanation of why someone somewhere doesn’t make a non-red-taped AI.
Yep, this is a plot hole.
Important other types of capital, as the term is used here, include:
the physical nuclear power plants
the physical nuts and bolts
data centres
military robots
Capital is not just money!
Because humans and other AIs will accept fiat currency as an input and give you valuable things as an output.
All the infra for fiat currency exists; I don’t see why the AIs would need to reinvent that, unless they’re hiding from human government oversight or breaking some capacity constraint in the financial system, in which case they can just use crypto instead.
Military robots are yet another type of capital! Note that if it were human soldiers, there would be much more human leverage in the situation, because at least some humans would need to agree to do the soldiering, would presumably get benefits for doing so, and would use the power and leverage they accrue to push broadly human goals.
Or then the recruitment company pivots to using human labour to improve AI, as actually happened with the hottest recent recruiting company! If AI is the best investment, then humans and AIs alike will spend their efforts on AI, and the economy will gradually cater more and more to AI needs over human needs. See Andrew Critch’s post here, for example. Or my story here.