Tracing the steps
Musings on the Yudkowsky-Hanson debate from 2011.
After all sorts of interesting technological things happening at some undetermined point in the future, will we see some small nucleus that controls all resources, or will we see civilisation-wide, large-scale participation in these things going down? [Robin Hanson, 2011]
What is the singularity? It means something different than it used to. Originally, the singularity was the breakdown in Vinge’s ability to model the future beyond the technological creation of intelligence smarter than humans. Or, as I. J. Good describes the intelligence explosion – “smarter minds building smarter minds”.
“The fastest form of intelligence explosion is an AI rewriting its own source code.” This is Yudkowsky’s 2011 scenario – “a brain in a box in a basement”. An intelligence explosion.[1]
A “brain in a box in a basement” starts very small, then gets better – and eventually outcompetes the rest of the world.
Let’s imagine the steppes or the savannah about 50,000 years ago. Someone drops a 1TB SSD containing “the intelligence”. What happens? Nothing – it’s just a shiny object for whoever finds it, traded away for a piece of skin or something. (We’re fond of trading shiny things.)
“The long-term history of our civilisation has been a vast increase in capacity – from language and farming to industry and who knows where. Lots of innovations have happened. Lots of big stories along the line. The major story is the steady, gradual growth. Most disruptions are small. On a larger scale it’s more steady.” [Robin Hanson]
Three major events in human history have seen an order-of-magnitude (OOM) increase in the pace of progress – the invention of language, the invention of farming, and the advent of industrial society. These three events are singularities, according to Hanson. Edinburgh got some advantage from being first to industrialise, but not a huge one. It didn’t take over the world.
In other words – never before in human history has an innovation, even the most disruptive one, caused a first mover to take over the world.
“We have strong reason to believe in a strong localized intelligence”, Yudkowsky says.
Hanson, by contrast, thinks we’ll see a general, gradual economic increase.
Brain in a box in a basement
Yudkowsky conflates the algorithm and the architecture. First he says evolution is the algorithm that designs intelligence “over millions and millions of years.” Then “all of a sudden, there’s this new architecture [humans]”.
To run with this analogy, the “new architecture” would be a different way to apply evolution. But that’s not really how evolution works – it isn’t an algorithm in the traditional sense (a set of instructions). Rather, “evolution” is an umbrella term for a bunch of necessary and actual facts, boundary conditions, and physical laws of the universe. So speaking of evolution as an algorithm isn’t really helpful. You can’t “change the set of instructions”; it’s more like a specific outcome tied to specific boundary conditions.
Asking “what if evolution ran on a different algorithm?” is nonsensical (in the positivist sense) – colorless green ideas sleep furiously.
More than that – intelligence isn’t a free-standing property of the world that exists on its own. It must be viewed in relation to an environment.
Now, this analogy can be saved (if we’re motivated enough, which we’ll pretend to be). It’s more proper to view evolution as an algorithm that produces artefacts, like the human mind. The “scaffolding” is our neuro-cognitive structure; the mereological “material” is neurons. In the current GPT paradigm, the “material” – the things that make up the thing – is vectors. You do some mathematical operations, a forward pass, and you get some output. The equivalent of the human neuro-cognitive structure is the sequence of vector and matrix operations: given an input vector x, you get an output y. The human mind is more than an inference engine, but that’s one part of it.
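To make that concrete, here is a minimal toy sketch (my own illustration, not an actual GPT – the shapes, weights, and activation are arbitrary) of “material” as vectors and “structure” as a fixed sequence of matrix operations:

```python
import numpy as np

# Toy illustration only: made-up shapes and random weights standing in for a
# learned "structure". The point is just: input vector -> fixed operations -> output.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))   # first layer of the made-up structure
W2 = rng.standard_normal((2, 8))   # second layer

def forward(x):
    """Given an input vector x, apply the fixed matrix operations and return y."""
    h = np.tanh(W1 @ x)            # intermediate representation
    return W2 @ h                  # output vector y

x = rng.standard_normal(4)         # some input vector x
y = forward(x)                     # the corresponding output y
```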
What is the basis of power? What does it matter that an AI can generate content? Power today is diffused bio-power. It isn’t located in any one “thing” in the system but in the system itself [the aggregate, emergent dynamic of all the things interacting]. An AI will only further strengthen the system, not coerce or subvert it.
It will change the trajectory of the system, and as this civilisation traverses the space-time of the universe, we’ll keep existing. We’ll do cooler stuff. My main fear is that we let the bad people ruin this and permanently bind ourselves to Earth and to our current technological state. We need a transition to the next stage.
To build on the earlier analogy – we need to set up a structure-within-the-structure that allows evolution to locally advance the things we care about. Evolution isn’t a global phenomenon but a local one. Our current civilisational configuration – the laws, the content, the people and their ideas, how these things are practised, the modes of communication, the telos and goals, the subversive actors and so on – all puts selection pressure on which artefacts come into being. Ultimately, we have things and natures and minds, and minds have valence, and we want more positive valence.
This essay could’ve been written by an AI. We’ve created a p-zombie: no emotion, no valence. Could we imbue it with valence? If we had a mathematical and a physical understanding of the conditions that produce valence, could we “print it”? Realistically, we could probably print things that have the capacity for valence (we already do this with procreation), but can we ensure the ergodicity of valence?
No singleton in sight
The problem, in my view, with Yudkowsky’s line of reasoning – his fear of the brain in the box in the basement – is that we can’t produce this yet. Sure, asking “if we produced this, would it kill humans, and how do we stop that?” is one interesting question. But it’s far from the most interesting question.
We need a new theory of minds, of brains, of valence and hedonicity, and a mathematics of happiness and joy.
“Which basement will win?” is such an uninspiring question.
I’d strongly disprefer a singleton. Don’t get me wrong, I love Amodei. I just don’t want him to have all the power. I fear this less with the others.
Nanotechnology
“How much smarter can we build an agent?” doesn’t really make sense; the concept of intelligence here is like the old philosophers speaking of God. The entire discussion revolves around a concept that doesn’t exist. It’s dogmatic. It’s ugly. It’ll lead to something beautiful once we figure out that intelligence is relative to a task or an environment. AlphaFold is better than me at folding proteins. Yudkowsky thinks “oh, we’ll have a general intelligence in a box that does everything better”, but then only speaks about research – “figuring out nanotechnology”, and ordering protein synthesis companies to build the machines that turn us all into nano-goo.
Is it physically possible to turn us all into nano-goo? Yes. It’s quite unlikely though. That’s not where the world is heading. We’re heading to continued gradual progress.
When Hanson says “content is what matters”, he means we need a machine that we can feed content and that scales in smartness. We found it in GPT. And nothing changed, fundamentally – not in the Vingean sense of “oh, we can’t predict the future beyond this point”. It all seems quite predictable and, dare I say it, boring at this point.
The groups of people – the startups – working on this say things like “oh, in 2025 we’ll have GPT-5 and it’ll have PhD-level intelligence”. Is this the singularity, “the point beyond which we can’t predict”? No.
We’ll have industrial APIs – produce this thing given this API call. We’ll have a thousand thousand experiments, and we’ll be bound by capital and our economy’s absorption capacity.
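A purely hypothetical sketch of what “produce this thing given this API call” might look like – the service URL, payload fields, and response format below are invented for illustration, not any real provider’s API:

```python
import requests  # assumes the third-party `requests` library is installed

# Hypothetical order: every field and the endpoint itself are made up.
order = {
    "design": "spec-or-sequence-goes-here",
    "quantity": 100,
    "deadline": "2026-01-01",
}

# Submit the order and read back a (made-up) order id; the "factory" side is
# whatever industrial process sits behind the endpoint.
response = requests.post("https://fab.example.com/v1/orders", json=order, timeout=30)
response.raise_for_status()
print(response.json()["order_id"])
```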
“A storm of recursive self-improvement” is only possible if we think a small algorithmic improvement without content creates a vastly superior intelligence. Why would one think that?
[1] My view – 5% chance of a localized intelligence explosion. If that happens, about 20% chance of that leading to AI takeover. Given AI takeover, about 10% chance that leads to “doom”. So, about a 0.1% chance of “AI doom”.
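Spelled out, that figure is just the product of the three conditional estimates:

$$0.05 \times 0.20 \times 0.10 = 0.001 = 0.1\%$$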
This is my first published essay and I see some downvotes. Feel free to share any critique since it could help me better understand the downvotes.
I first bounced off the calculation in the footnote. It is nonsensical without some extraordinarily powerful assumptions that you don’t even state, let alone argue for.
“5% chance of a localized intelligence explosion” – fine; way lower than I’d put it, but not out of bounds.
“If that happens, about 20% chance of that leading to AI takeover” is arguable, depending on what you mean by “intelligence explosion”. It’s plausible if you think that almost all such “explosions” produce systems only weakly more powerful than humans, but again you don’t state or argue for this.
“Given AI takeover, about 10% chance that leads to ‘doom’” also seems very low.
“So, about a 0.1% chance of ‘AI doom’.” Wait, WTF? Did you just multiply those to get an overall chance of AI doom? Are you seriously claiming the only way to get AI doom is via the very first intelligence explosion leading to takeover and doom? How? Why?
If you were serious about this, you’d consider that localized intelligence explosion is not the only path to superintelligence. You’d consider that if one intelligence explosion can happen, then more than one can happen, and a 20% chance that any one such event leads to takeover is not the same as the overall probability of AI takeover being 20%. You’d consider that 10% chance of any given AI takeover causing doom is not the same as the overall probability of doom from AI takeover. You’d consider that superintelligent AI could cause doom even without actually taking control of the world, e.g. by faithfully giving humans the power to cause their own doom while knowing that it will result.
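(For illustration, assuming independence and picking a number of events out of the air: if $n$ separate intelligence explosions each carried an independent 20% chance of leading to takeover, then

$$P(\text{at least one takeover}) = 1 - (1 - 0.2)^n,$$

which is already about 67% for $n = 5$.)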
Also consider that in 2011 Yudkowsky was naive and optimistic. His central scenario was what can go wrong when humans actually try to contain a potential superintelligence: they limit it to a brain in a box in an isolated location, and he worked on what can go wrong even then. The intervening thirteen years have shown that we’re not likely to even try, which opens up the space of possible avenues to doom even further.
Most of the later conclusions you reach also lack supporting evidence or argument – such as “That’s not where the world is heading. We’re heading to continued gradual progress.” You present this as fact, without any supporting evidence. How do you know, with perfect certainty or even actionable confidence, that this is how we are going to continue? Why should I believe this assertion?