We’re already in AI takeoff
Back in 2016, CFAR pivoted to focusing on xrisk. I think the magic phrase at the time was:
“Rationality for its own sake, for the sake of existential risk.”
I was against this move. I also had no idea how power works. I don’t know how to translate this into LW language, so I’ll just use mine: I was, secretly even to myself, vastly more interested in playing the victim at people/institutions/the world than I was in doing real things.
But the reason I was against the move is solid. I still believe in it.
I want to spell that part out a bit. Not to gripe about the past. The past makes sense to me. But because the idea still applies.
I think it’s a simple idea once it’s not cloaked in bullshit. Maybe that’s an illusion of transparency. But I’ll try to keep this simple-to-me, and correct toward more detail when asked (and when I feel like it), rather than spelling out all the details in a way that turns out to have been unneeded.
Which is to say, this’ll be kind of punchy and under-justified.
The short version is this:
We’re already in AI takeoff. The “AI” is just running on human minds right now. Sorting out AI alignment in computers is focusing entirely on the endgame. That’s not where the causal power is.
Maybe that’s enough for you. If so, cool.
I’ll say more to gesture at the flesh of this.
What kind of thing is wokeism? Or Communism? What kind of thing was Nazism in WWII? Or the flat Earth conspiracy movement? QAnon?
If you squint a bit, you might see there’s a common type here.
In a Facebook post I argued that it’s fair to view these things as alive. Well, really, I just described them as living, which kind of is the argument. If your woo allergy keeps you from seeing that… well, good luck to you. But if you’re willing to just assume I mean something non-woo, you just might see something real there.
These hyperobject creatures are undergoing massive competitive evolution. Thanks, Internet. They’re competing for resources: literal things like land, money, political power… and most importantly, human minds.
I mean something loose here. Y’all are mostly better at details than I am. I’ll let you flesh those out rather than pretending I can do it well.
But I’m guessing you know this thing. We saw it in the pandemic, where friendships got torn apart because people got hooked by competing memes. Some became “plandemic” conspiracy-theorist anti-vax types, some fell into blind belief in provably incoherent authorities, some rode the whole anti-racism woke wave, etc.
This is people getting possessed.
And the… things… possessing them are optimizing hard for this.
To borrow a bit from fiction: It’s worth knowing that in their original vision for The Matrix, the Wachowski siblings wanted humans to be processors, not batteries. The Matrix was a way of harvesting human computing power. As I recall, they had to change it because someone argued that people wouldn’t understand their idea.
I think we’re in a scenario like this. Not so much the “in a simulation” part. (I mean, maybe. But for what I’m saying here I don’t care.) But yes, in the sense of a functionally nonhuman intelligence hijacking our minds to do coordinated computations.
(And no, I’m not positing a ghost in the machine, any more than I posit a ghost in the machine of “you” when I pretend that you are an intelligent agent. If we stop pretending that intelligence is ontologically separate from the structures it’s implemented on, then the same thing that lets “superintelligent agent” mean anything at all says we already have several.)
We’re already witnessing orthogonality.
The talk of “late-stage capitalism” points at this. The way greenwashing appears, for instance, is intelligently weaponized Goodhart. It’s explicitly hacking people’s signals in order to extract what the hypercreature in question wants from people (usually profit).
With its social credit system and facial recognition tech layered onto a one-party system, China appears to be drifting toward a Shriek. Maybe I’m badly informed here. But the point is the possibility.
In the USA, we have to file income taxes every year even though we have the tech to make it a breeze. Why? “Lobbying” is right, but that describes the action. What’s the intelligence behind the action? What agent becomes your intentional opponent if you try to change this? You might point at specific villains, but they’re not really the cause. The CEO of Intuit (TurboTax’s maker) doesn’t stay CEO if he doesn’t serve the hypercreature’s hunger.
I’ll let you fill in other examples.
If the whole world were unified on AI alignment being an issue, it’d just be a problem to solve.
The problem that’s upstream of this is the lack of will.
Same thing with cryonics really. Or aging.
But AI is particularly acute around here, so I’ll stick to that.
The problem is that people’s minds aren’t clear enough to look at the problem for real. Most folk can’t orient to AI risk without going nuts or numb or spitting out gibberish platitudes.
I think this is part accidental and part hypercreature-intentional.
The accidental part is like how advertisements do a kind of DDoS attack on people’s sense of inherent self-worth. There isn’t even a single egregore to point at as the cause of that. It’s just that many, many such hypercreatures benefit from the deluge of subtly negative messaging and therefore tap into it in a sort of (for them) inverse tragedy of the commons. (Victory of the commons?)
In the same way, there’s a very particular kind of stupid that (a) is pretty much independent of g factor and (b) is super beneficial for these hypercreatures as a pathway to possession.
And I say “stupid” both because it’s evocative and because of its ties to terms like “stupendous” and “stupefy”. I interpret “stupid” to mean something like “stunned”. Like the mind is numb and pliable.
It so happens that the shape of this stupid keeps people from being grounded in the physical world. Like, how do you get a bunch of trucks out of a city? How do you fix the plumbing in your house? Why six feet for social distancing? It’s easier to drift to supposed-to’s and blame minimization. A mind that does that is super programmable.
The kind of clarity that you need to de-numb and actually goddamn look at AI risk is pretty anti all of this. It’s inoculation against zombiism.
So for one, that’s just hard.
But for two, once a hypercreature (of this type) notices this immunity taking hold, it’ll double down. Evolve weaponry.
That’s the “intentional” part.
This is where people — having their minds co-opted for Matrix-like computation — will pour their intelligence into dismissing arguments for AI risk.
This is why we can’t get serious enough buy-in to this problem.
Which is to say, the problem isn’t a need for AI alignment research.
The problem is current hypercreature unFriendliness.
From what I’ve been able to tell, AI alignment folk for the most part are trying to look at this external thing, this AGI, and make it aligned.
I think this is doomed.
Not just because we’re out of time. That might be.
But the basic idea was already self-defeating.
Who is aligning the AGI? And to what is it aligning?
This isn’t just a cute philosophy problem.
A common result of egregoric stupefaction is identity fuckery. We get this image of ourselves in our minds, and then we look at that image and agree “Yep, that’s me.” Then we rearrange our minds so that all those survival instincts of the body get aimed at protecting the image in our minds.
How did you decide which bits are “you”? Or what can threaten “you”?
I’ll hop past the deluge of opinions and just tell you: It’s these superintelligences. They shaped your culture’s messages, probably shoved you through public school, gripped your parents to scar you in predictable ways, etc.
It’s like installing a memetic operating system.
If you don’t sort that out, then that OS will drive how you orient to AI alignment.
My guess is, it’s a fuckton easier to sort out Friendliness/alignment within a human being than it is on a computer. Because the stuff making up Friendliness is right there.
And by extension, I think it’s a whole lot easier to create/invoke/summon/discover/etc. a Friendly hypercreature than it is to solve digital AI alignment. The birth of science was an early example.
I’m pretty sure this alignment needs to happen in first person. Not third person. It’s not (just) an external puzzle, but is something you solve inside yourself.
A brief but hopefully clarifying aside:
Stephen Jenkinson argues that most people don’t know they’re going to die. Rather, they know that everyone else is going to die.
That’s what changes when someone gets a terminal diagnosis.
I mean, if I have a 100% reliable magic method for telling how you’re going to die, and I tell you “Oh, you’ll get a heart attack and that’ll be it”, that’ll probably feel weird but it won’t fill you with dread. If anything it might free you because now you know there’s only one threat to guard against.
But there’s a kind of deep, personal dread, a kind of intimate knowing, that comes when the doctor comes in with a particular weight and says “I’ve got some bad news.”
It’s immanent.
You can feel that it’s going to happen to you.
Not the idea of you. It’s not “Yeah, sure, I’m gonna die someday.”
It becomes real.
You’re going to experience it from behind the eyes reading these words.
From within the skin you’re in as you witness this screen.
When I talk about alignment being “first person and not third person”, it’s like this: the way knowing your mortality doesn’t happen until it happens in first person.
Any kind of “alignment” or “Friendliness” or whatever that doesn’t put that first-person-ness at the very center isn’t a thing worth crowing about.
I think that’s the core mistake anyway. Why we’re in this predicament, why we have unaligned superintelligences ruling the world, and why AGI looks so scary.
It’s in forgetting the center of what really matters.
It’s worth noting that the only scale that matters anymore is the hypercreature one.
I mean, one of the biggest things a single person can build on their own is a house. But that’s hard, and most people can’t do that. Mostly companies build houses.
Solving AI alignment is fundamentally a coordination problem. The kind of math/programming/etc. needed to solve it is literally superhuman, the way the four-color theorem was (and still kind of is) superhuman.
“Attempted solutions to coordination problems” is a fine proto-definition of the hypercreatures I’m talking about.
So if the creatures you summon to solve AI alignment aren’t Friendly, you’re going to have a bad time.
And for exactly the same reason that most AGIs aren’t Friendly, most emergent egregores aren’t either.
As individuals, we seem to have some glimmer of ability to lean toward resonance with one hypercreature or another. Even just choosing what info diet you’re on can do this. (Although there’s an awful lot of magic in that “just choosing” part.)
But that’s about it.
We can’t align AGI. That’s too big.
It’s too big the way the pandemic was too big, and the Ukraine/Putin war is too big, and wokeism is too big.
When individuals try to act on the “god” scale, they usually just get possessed. That’s the stupid simple way of solving coordination problems.
So when you try to contribute to solving AI alignment, what egregore are you feeding?
If you don’t know, it’s probably an unFriendly one.
(Also, don’t believe your thoughts too much. Where did they come from?)
So, I think raising the sanity waterline is upstream of AI alignment.
It’s like we’ve got gods warring, and they’re threatening to step into digital form to accelerate their war.
We’re freaking out about their potential mech suits.
But the problem is the all-out war, not the weapons.
We have an advantage in that this war happens on and through us. So if we take responsibility for this, we can influence the terrain and bias egregoric/memetic evolution to favor Friendliness.
Anything else is playing at the wrong level. Not our job. Can’t be our job. Not as individuals, and it’s individuals who seem to have something mimicking free will.
Sorting that out in practice seems like the only thing worth doing.
Not “solving xrisk”. We can’t do that. Too big. That’s worth modeling, since the gods need our minds in order to think and understand things. But attaching desperation and a sense of “I must act!” to it is insanity. Food for the wrong gods.
Ergo, I support rationality for its own sake, period.
That, at least, seems to target a level at which we mere humans can act.