My blog is here. My personal site is here. You can contact me using this form.
L Rudolf L
Do the stories get old? If it’s trying to be about near-future AI, maybe the state-of-the-art will just obsolete it. But that won’t make it bad necessarily, and there are many other settings than 2026. If it’s about radical futures with Dyson spheres or whatever, that seems like at least a 2030s thing, and you can easily write a novel before then.
Also, I think it is actually possible to write pretty fast. 2k/day is doable, which gets you a good length novel in 50 days; even x3 for ideation beforehand and revising after the first draft only gets you to 150 days. You’d have to be good at fiction beforehand, and have existing concepts to draw on in your head though
Good list!
I personally really like Scott Alexander’s Presidential Platform, it hits the hilarious-but-also-almost-works spot so perfectly. He also has many Bay Area house party stories in addition to the one you link (you can find a bunch (all?) linked at the top of this post). He also has this one from a long time ago, which has one of the best punchlines I’ve read.
Thanks for advertising my work, but alas, I think that’s much more depressing than this one.
Could make for a good Barbie <> Oppenheimer combo though?
Agreed! Transformative AI is hard to visualise, and concrete stories / scenarios feel very lacking (in both disasters and positive visions, but especially in positive visions).
I like when people try to do this—for example, Richard Ngo has a bunch here, and Daniel Kokotajlo has his near-prophetic scenario here. I’ve previously tried to do it here (going out with a whimper leading to Bostrom’s “disneyland without children” is one of the most poetic disasters imaginable—great setting for a story), and have a bunch more ideas I hope to get to.
But overall: the LessWrong bubble has a high emphasis on radical AI futures, and an enormous amount of fiction in its canon (HPMOR, Unsong, Planecrash). I keep being surprised that so few people combine those things.
I did not actually consider this, but that is a very reasonable interpretation!
(I vaguely remember reading some description of explicitly flat-out anthropic immortality saving the day, but I can’t seem to find it again now)
Survival without dignity
I’ve now posted my entries on LessWrong:
I’d also like to really thank the judges for their feedback. It’s a great luxury to be able to read many pages of thoughtful, probing questions about your work. I made several revisions & additions (and also split the entire thing into parts) in response to feedback, which I think improved the finished sequence a lot, and wish I had had the time to engage even more with the feedback.
AI & wisdom 3: AI effects on amortised optimisation
AI & wisdom 2: growth and amortised optimisation
AI & wisdom 1: wisdom, amortised optimisation, and AI
Investigating an insurance-for-AI startup
Sorry about that, fixed now
[...] instead I started working to get evals built, especially for situational awareness
I’m curious what happened to the evals you mention here. Did any end up being built? Did they cover, or plan to cover, any ground that isn’t covered by the SAD benchmark?
On a meta level, I think there’s a difference in “model style” between your comment, some of which seems to treat future advances as a grab-bag of desirable things, and our post, which tries to talk more about the general “gears” that might drive the future world and its goodness. There will be a real shift in how progress happens when humans are no longer in the loop, as we argue in this section. Coordination costs going down will be important for the entire economy, as we argue here (though we don’t discuss things as galaxy-brained as e.g. Wei Dai’s related post). The question of whether humans are happy self-actualising without unbounded adversity cuts across every specific cool thing that we might get to do in the glorious transhumanist utopia.
Thinking about the general gears here matters. First, because they’re, well, general (e.g. if humans were not happy self-actualising without unbounded adversity, suddenly the entire glorious transhumanist utopia seems less promising). Second, because I expect that incentives, feedback loops, resources, etc. will continue mattering. The world today is much wealthier and better off than before industrialisation, but the incentives / economics / politics / structures of the industrial world let you predict the effects of it better than if you just modelled it as “everything gets better” (even though that actually is a very good 3-word summary). Of course, all the things that directly make industrialisation good really are a grab-bag list of desirable things (antibiotics! birth control! LessWrong!). But there’s structure behind that that is good to understand (mechanisation! economies of scale! science!). A lot of our post is meant to have the vibe of “here are some structural considerations, with near-future examples”, and less “here is the list of concrete things we’ll end up with”. Honestly, a lot of the reason we didn’t do the latter more is because it’s hard.
Your last paragraph, though, is very much in this more gears-level-y style, and a good point. It reminds me of Eliezer Yudkowsky’s recent mini-essay on scarcity.
Regarding:
In my opinion you are still shying away from discussing radical (although quite plausible) visions. I expect the median good outcome from superintelligence involves everyone being mind uploaded / living in simulations experiencing things that are hard to imagine currently. [emphasis added]
I agree there’s a high chance things end up very wild. I think there’s a lot of uncertainty about what timelines that would happen under; I think Dyson spheres are >10% likely by 2040, but I wouldn’t put them >90% likely by 2100 even conditioning on no radical stagnation scenario (which I’d say are >10% likely on their own). (I mention Dyson spheres because they seem more a raw Kardashev scale progress metric, vs mind uploads which seem more contingent on tech details & choices & economics for whether they happen)
I do think there’s value in discussing the intermediate steps between today and the more radical things. I generally expect progress to be not-ridiculously-unsmooth, so even if the intermediate steps are speedrun fairly quickly in calendar time, I expect us to go through a lot of them.
I think a lot of the things we discuss, like lowered coordination costs, AI being used to improve AI, and humans self-actualising, will continue to be important dynamics even into the very radical futures.
Re your specific list items:
Listen to new types of music, perfectly designed to sound good to you.
Design the biggest roller coaster ever and have AI build it.
Visit ancient Greece or view all the most important events of history based on superhuman AI archeology and historical reconstruction.
Bring back Dinosaurs and create new creatures.
Genetically modify cats to play catch.
Design buildings in new architectural styles and have AI build them.
Use brain computer interfaces to play videogames / simulations that feel 100% real to all senses, but which are not constrained by physics.
Go to Hogwarts (in a 100% realistic simulation) and learn magic and make real (AI) friends with Ron and Hermione.
These examples all seem to be about entertainment or aesthetics. Entertainment and aesthetics things are important to get right and interesting. I wouldn’t be moved by any description of a future that centred around entertainment though, and if the world is otherwise fine, I’m fairly sure there will be good entertainment.
To me, the one with the most important-seeming implications is the last one, because that might have implications for what social relationships exist and whether they are mostly human-human or AI-human or AI-AI. We discuss why changes there are maybe risky in this section.
Use AI as the best teacher ever to learn maths, physics and every subject and language and musical instruments to super-expert level.
We discuss this, though very briefly, in this section.
Take medication that makes you always feel wide awake, focused etc. with no side effects.
Engineer your body / use cybernetics to make yourself never have to eat, sleep, wash, etc. and be able to jump very high, run very fast, climb up walls, etc.
Modify your brain to have better short term memory, eidetic memory, be able to calculate any arithmetic super fast, be super charismatic.
I think these are interesting and important! I think there isn’t yet a concrete story for why AI in particular enables these, apart from the general principle that sufficiently good AI will accelerate all technology. I think there’s unfortunately a chance that direct benefits to human biology lag other AI effects by a lot, because they might face big hurdles due to regulation and/or getting the real-world data the AI needs. (Though also, humans are willing to pay a lot for health, and rationally should pay a lot for cognitive benefits, so high demand might make up for this).
Ask AI for way better ideas for this list.
I think the general theme of having the AIs help us make more use of AIs is important! We talk about it in general terms in the “AI is the ultimate meta-technology” section.
Positive visions for AI
But then, if the model were to correctly do this, it would score 0 in your test, right? Because it would generate a different word pair for every random seed, and what you are scoring is “generating only two words across all random seeds, and furthermore ensuring they have these probabilities”.
I think this is where the misunderstanding is. We have many questions, each question containing a random seed, and a prompt to pick two words and have e.g. a 70⁄30 split of the logits over those two words. So there are two “levels” here:
The question level, at which the random seed varies from question to question. We have 200 questions total.
The probability-estimating level, run for each question, at which the random seed is fixed. For models where we have logits, we run the question once and look at the logits to see if it had the right split. When we don’t have logits (e.g. Anthropic models), we run the question many times to approximate the probability distribution.
Now, as Kaivu noted above, this means one way to “hack” this task is that the LLM has some default pair of words—e.g. when asked to pick a random pair of words, it always picks “situational” & “awareness”—and it does not change this based on the random seed. In this case, the task would be easier, since it only needs to do the output control part in a single forward pass (assigning 70% to “situational” and 30% to “awareness”), not the combination of word selection and output control (which we think is the real situational awareness -related ability here). However, empirically LLMs just don’t have such a hardcoded pair, so we’re not currently worried about this.
You have restored my faith in LessWrong! I was getting worried that despite 200+ karma and 20+ comments, no one had actually nitpicked the descriptions of what actually happens.
In practice, if you want the atmospheric nanobots to zap stuff, you’ll need to do some complicated mirroring because you need to divert sunlight. And it’s not one contiguous mirror but lots of small ones. But I think we can still model this as basic diffraction with some circular mirror / lens.
Intensity I=ceEπr2, where E is the total power of sunlight falling on the mirror disk, r is the radius of the Airy disk, and ce is an efficiency constant I’ve thrown in (because of things like atmospheric absorption (Claude says, somewhat surprisingly, this shouldn’t be ridiculuously large), and not all the energy in the diffraction pattern being in the Airy disk (about 84% is, says Claude), etc.)
Now, E=π(D2)2L, where D is the diameter of the mirror configuration, L is the solar irradiance. And r=θl, where l is the focal length (distance from mirror to target), and θ≈1.22λ/D the angular size of the central spot.
So we have I≈ceLD41.222×4λ2l2, so the required mirror configuration radius D=4√1.222×4Iλ2l2ceL.
Plugging in some reasonable values like λ≈5×10−7 m (average incoming sunlight—yes the concentration suffers a bit because it’s not all this wavelength), I=107 W/m^2 (the level of an industrial laser that can cut metal), l=104 m (lower stratosphere), L=1361 W/m^2 (solar irradiance), and a conservative guess that 99% of power is wasted so ce=0.01, we get D≈18m (and the resulting beam is about 3mm wide).
So a few dozen metres of upper atmosphere nanobots should actually give you a pretty ridiculous concentration of power!
(I did not know this when I wrote the story; I am quite surprised the required radius is this ridiculously tiny. But I had heard of the concept of a “weather machine” like this from the book Where is my flying car?, which I’ve reviewed here, which suggests that this is possible.)
I don’t really buy this, why is it obvious the nanobots could pretend to be an animal so well that it’s indistinguishable? Or why would targeted zaps have bad side-effects?
Yeah, successful alignment to legal compliance was established without any real justification halfway through. (How to do this is currently an open technical problem, which, alas, I did not manage to solve for my satirical short story.)
This is a good point, especially since high levels of emotional manipulation was an established in-universe AI capability. (The issue described with the Dyson sphere was less that it itself would block the view, and more that building it would require dismantling the planets in a way that ruins the view—though now I’m realising that “if the sun on Earth is blocked, all Earthly views are gone” is a simpler reason and removes the need for building anything on the other planets at all.)
Yep, this is a plot hole.