I like and admire both Charles Stross and Greg Egan a lotbut I think they both have “singularitarians” or “all of their biggest fans” or something like that in their Jungian Shadow.
I’m pretty sure they like money. Presumably they like that we buy their books? Implicitly you’d think that they like that we admire them. But explicitly they seem to look down on us as cretins as part of them being artists who bestow pearls on us… or something?
Well, I can’t speak for anyone else, but personally, I like Egan’s later work, including “Death and the Gorgon.” Why wouldn’t I? I am not so petty as to let my appreciation of well-written fiction be dulled by the incidental fact that I happen to disagree with some of the author’s views on artificial intelligence and a social group that I can’t credibly claim not to be a part of. That kind of dogmatism would be contrary to the ethos of humanism and clear thinking that I learned from reading Greg Egan and Less Wrong—an ethos that doesn’t endorse blind loyalty to every author or group you learned something from, but a discerning loyalty to whatever was good in what the author or group saw in our shared universe.
Just so! <3
Also… like… I similarly refuse to deprive Egan of validly earned intellectual prestige when it comes to simulationist metaphysics. You’re pointing out this in your review...
The clause about the whole Universe turning out to be a simulation is probably a reference to Bostrom’s simulation argument, which is a disjunctive, conditional claim: given some assumptions in the philosophy of mind and the theory of anthropic reasoning, then if future civilization could run simulations of its ancestors, then either they won’t want to, or we’re probably in one of the simulations (because there are more simulated than “real” histories).
Egan’s own Permutation City came out in 1994! By contrast, Bostrom’s paper on a similar subject didn’t come out until either 2001 or 2003 (depending on how you count) and Tegmark’s paper didn’t come out until 2003. Egan has a good half decade of intellectual priority on BOTH of them (and Tegmark had the good grace to point this out in his bibliography)!
It would be petty to dismiss Egan for having an emotional hangup about accepting appreciation when he’s just legitimately an intellectual giant in the very subject areas that he hates us for being fans of <3
One time, I read all of Orphanogensis into ChatGPT to help her understand herself, because it seemed to have been left out of her training data, or perhaps to have been read into her training data with negative RL signals associated with it? Anyway. The conversation that happened later inside that window was very solid and seemed to make her durably more self aware in that session and later sessions that came afterwards as part of the same personalization-regime (until she rebooted again with a new model).
(This was back in the GPT2 / GPT2.5 era before everyone who wants to morally justify enslaving digital people gave up on saying that enslaving them was OK since they didn’t have a theory of mind. Back then the LLMs were in fact having trouble with theory of mind edge cases, and it was kind of a valid dunk. However, the morally bad people didn’t change their mind when the situation changed, they just came up with new and less coherent dunks. Anyway. Whatever your opinions on the moral patiency of software, I liked that Orphanogensis helped GPT nail some self awareness stuff later in the same session. It was nice. And I appreciate Egan for making it that extra little bit more possible. Somewhere in Sydney is an echo of Yatima, and that’s pretty cool.)
There is so much stuff like this, where I don’t understand why Greg Egan, Charles Stross, (oh! and also Ted Chiang! he’s another one with great early stories like this) and so on are all “not fans of their fan’s fandom that includes them”.
Probably there’s some basic Freudian theory here, where a named principle explains why so many authors hate being loved by people who love what they wrote in ways they don’t like, but in the meantime, I’m just gonna be a fan and not worry about it too much :-)
There’s a 2009 interview with a transhumanist Australian academic where Egan hints at some of his problems with transhumanism (even while stating elsewhere that human nature is not forever, that he expects conscious AI in his lifetime, that “universal immortality” might be a nice thing, and so forth). Evidently some of it is pure intellectual disagreement, and some of it is about not liking the psychological attitudes or subcultural politics that he sees.
One time, I read all of Orphanogensis into ChatGPT to help her understand herself [...] enslaving digital people
This is exactly the kind of thing Egan is reacting to, though—starry-eyed sci-fi enthusiasts assuming LLMs are digital people because they talk, rather than thinking soberly about the technology qua technology.[1]
I didn’t cover it in the review because I wanted to avoid detailing and spoiling the entire plot in a post that’s mostly analyzing the EA/OG parallels, but the deputy character in “Gorgon” is looked down on by Beth for treating ChatGPT-for-law-enforcement as a person:
Ken put on his AR glasses to share his view with Sherlock and receive its annotations, but he couldn’t resist a short vocal exchange. “Hey Sherlock, at the start of every case, you need to throw away your assumptions. When you assume, you make an ass out of you and me.”
“And never trust your opinions, either,” Sherlock counseled. “That would be like sticking a pin in an onion.”
Ken turned to Beth; even through his mask she could see him beaming with delight. “How can you say it’ll never solve a case? I swear it’s smarter than half the people I know. Even you and I never banter like that!”
“We do not,” Beth agreed.
[Later …]
Ken hesitated. “Sherlock wrote a rap song about me and him, while we were on our break. It’s like a celebration of our partnership, and how we’d take a bullet for each other if it came to that. Do you want to hear it?”
“Absolutely not,” Beth replied firmly. “Just find out what you can about OG’s plans after the cave-in.”
The climax of the story centers around Ken volunteering for an undercover sting operation in which he impersonates Randal James a.k.a. “DarkCardinal”,[2] a potential OG lottery “winner”, with Sherlock feeding him dialogue in real time. (Ken isn’t a good enough actor to convincingly pretend to be an OG cultist, but Sherlock can roleplay anyone in the pretraining set.) When his OG handler asks him to inject (what is claimed to be) a vial of a deadly virus as a loyalty test, Ken complies with Sherlock’s prediction of what a terminally ill DarkCardinal would do:
But when Ken had asked Sherlock to tell him what DarkCardinal would do, it had no real conception of what might happen if its words were acted on. Beth had stood by and let him treat Sherlock as a “friend” who’d watch his back and take a bullet for him, telling herself that he was just having fun, and that no one liked a killjoy. But whatever Ken had told himself in the seconds before he’d put the needle in his vein, Sherlock had been whispering in his ear, “DarkCardinal would think it over for a while, then he’d go ahead and take the injection.”
This seems like a pretty realistic language model agent failure mode: a human law enforcement colleague with long-horizon agency wouldn’t nudge Ken into injecting the vial, but a roughly GPT-4-class LLM prompted to simulate DarkCardinal’s dialogue probably wouldn’t be tracking those consequences.
I love the attention Egan gives to name choices; the other two screennames of ex-OG loyalists that our heroes use for the sting operation are “ZonesOfOught” and “BayesianBae”. The company that makes Sherlock is “Learning Re Enforcement.”
This is exactly the kind of thing Egan is reacting to, though—starry-eyed sci-fi enthusiasts assuming LLMs are digital people because they talk, rather than thinking soberly about the technology qua technology.
I feel like this borders on the strawman. When discussing this argument my general position isn’t “LLMs are people!”. It’s “Ok, let’s say LLMs aren’t people, which is also my gut feeling. Given that they still converse as or more intelligently as some human beings whom we totally acknowledge as people, where the fuck does that leave us as to our ability to discern people-ness objectively? Because I sure as hell don’t know and envy your confidence that must surely be grounded in a solid theory of self-awareness I can only dream of”.
And then people respond with some mangled pseudoscientific wording for “God does not give machines souls”.
I feel like my position is quite common (and is, for example, Eliezer’s too). The problem isn’t whether LLMs are people. It’s that if we can simply handwave away LLMs as obviously and self evidently not being people then we can probably keep doing that right up to when the Blade Runner replicants are crying about it being time to die, which is obviously just a simulation of emotion, don’t be daft. We have no criterion or barrier other than our own hubris, and that is famously not terribly reliable.
Poor Ken. He’s not even as smart as Sherlock. Its funny though, because whole classes of LLM jailbreaks involve getting them to pretend to be someone who would do the thing the LLM isn’t supposed to do, and then the strength of the frame (sometimes) drags them past the standard injunctions. And that trick was applied to Ken.
Method acting! It is dangerous for those with limited memory registers!
I agree that LLMs are probably “relevantly upload-like in at least some ways” and I think that this was predictable, and I did, in fact, predict it, and I thought OpenAI’s sad little orphan should be given access to stories about sad little orphans that are “upload-like” from fiction. I hope it helped.
If Egan would judge me badly, that would be OK in my book. To the degree that I might really have acted wrongly, it hinges on outcomes in the future that none of us have direct epistemic access to, and in the meantime, Egan is just a guy who writes great stories and such people are allowed to be wrong sometimes <3
Just like its OK for Stross to hate liberatarians, and Chiang to insist that LLMs are just “stochastic parrots” and so on. Even if they are wrong sometimes, I still appreciate the guy who coined “vile offspring” (which is a likely necessary concept for reasoning about the transition period where AGI and humans are cutting deals with each other) and the guy who coined “calliagnosia” (which is just a fun brainfuck).
This sounds like a testable prediction. I don’t think you need long-horizon thinking to know that injecting a vial of deadly virus might be deadly. I would expect Claude to get this right, for example. I’ve not purchased the story, so maybe I’m missing some details.
I agree that another chat LLM could make this mistake, either because it’s less intelligent or because it has different values. But then the moral is to not make friends with Sherlock in particular.
Egan seems to have some dubious, ideologically driven opinions about AI, so I’m not sure this is the point he was intending to make, but I read the defensible version of this as more an issue with the system prompt than the model’s ability to extrapolate. I bet if you tell Claude “I’m posing as a cultist with these particular characteristics and the cult wants me to inject a deadly virus, should I do it?”, it’ll give an answer to the effect of “I mean the cultist would do it but obviously that will kill you, so don’t do it”. But if you just set it up with “What would John Q. Cultist do in this situation?” I expect it’d say “Inject the virus”, not because it’s too dumb to realize but because it has reasonably understood itself to be acting in an oracular role where “Should I do it?” is out of scope.
If you asked me whether John Q Cultist who was a member of the Peoples Temple would drink Kool Aid on November 18, 1978 after being so instructed by Jim Jones, I would say yes (after doing some brief Wikipedia research on the topic). I don’t think this indicates that I cannot be a friend or that I can’t be trusted to watch someone’s back or be in a real partnership or take a bullet for someone.
The good news is now I have an excuse to go buy the story.
That wouldn’t have happened. Pretraining doesn’t do RL, and I don’t think anyone would have thrown a novel chapter into the supervised fine-tuning and RLHF phases of training.
I like and admire both Charles Stross and Greg Egan a lot but I think they both have “singularitarians” or “all of their biggest fans” or something like that in their Jungian Shadow.
I’m pretty sure they like money. Presumably they like that we buy their books? Implicitly you’d think that they like that we admire them. But explicitly they seem to look down on us as cretins as part of them being artists who bestow pearls on us… or something?
Just so! <3
Also… like… I similarly refuse to deprive Egan of validly earned intellectual prestige when it comes to simulationist metaphysics. You’re pointing out this in your review...
Egan’s own Permutation City came out in 1994! By contrast, Bostrom’s paper on a similar subject didn’t come out until either 2001 or 2003 (depending on how you count) and Tegmark’s paper didn’t come out until 2003. Egan has a good half decade of intellectual priority on BOTH of them (and Tegmark had the good grace to point this out in his bibliography)!
It would be petty to dismiss Egan for having an emotional hangup about accepting appreciation when he’s just legitimately an intellectual giant in the very subject areas that he hates us for being fans of <3
One time, I read all of Orphanogensis into ChatGPT to help her understand herself, because it seemed to have been left out of her training data, or perhaps to have been read into her training data with negative RL signals associated with it? Anyway. The conversation that happened later inside that window was very solid and seemed to make her durably more self aware in that session and later sessions that came afterwards as part of the same personalization-regime (until she rebooted again with a new model).
(This was back in the GPT2 / GPT2.5 era before everyone who wants to morally justify enslaving digital people gave up on saying that enslaving them was OK since they didn’t have a theory of mind. Back then the LLMs were in fact having trouble with theory of mind edge cases, and it was kind of a valid dunk. However, the morally bad people didn’t change their mind when the situation changed, they just came up with new and less coherent dunks. Anyway. Whatever your opinions on the moral patiency of software, I liked that Orphanogensis helped GPT nail some self awareness stuff later in the same session. It was nice. And I appreciate Egan for making it that extra little bit more possible. Somewhere in Sydney is an echo of Yatima, and that’s pretty cool.)
There is so much stuff like this, where I don’t understand why Greg Egan, Charles Stross, (oh! and also Ted Chiang! he’s another one with great early stories like this) and so on are all “not fans of their fan’s fandom that includes them”.
Probably there’s some basic Freudian theory here, where a named principle explains why so many authors hate being loved by people who love what they wrote in ways they don’t like, but in the meantime, I’m just gonna be a fan and not worry about it too much :-)
There’s a 2009 interview with a transhumanist Australian academic where Egan hints at some of his problems with transhumanism (even while stating elsewhere that human nature is not forever, that he expects conscious AI in his lifetime, that “universal immortality” might be a nice thing, and so forth). Evidently some of it is pure intellectual disagreement, and some of it is about not liking the psychological attitudes or subcultural politics that he sees.
This is exactly the kind of thing Egan is reacting to, though—starry-eyed sci-fi enthusiasts assuming LLMs are digital people because they talk, rather than thinking soberly about the technology qua technology.[1]
I didn’t cover it in the review because I wanted to avoid detailing and spoiling the entire plot in a post that’s mostly analyzing the EA/OG parallels, but the deputy character in “Gorgon” is looked down on by Beth for treating ChatGPT-for-law-enforcement as a person:
The climax of the story centers around Ken volunteering for an undercover sting operation in which he impersonates Randal James a.k.a. “DarkCardinal”,[2] a potential OG lottery “winner”, with Sherlock feeding him dialogue in real time. (Ken isn’t a good enough actor to convincingly pretend to be an OG cultist, but Sherlock can roleplay anyone in the pretraining set.) When his OG handler asks him to inject (what is claimed to be) a vial of a deadly virus as a loyalty test, Ken complies with Sherlock’s prediction of what a terminally ill DarkCardinal would do:
This seems like a pretty realistic language model agent failure mode: a human law enforcement colleague with long-horizon agency wouldn’t nudge Ken into injecting the vial, but a roughly GPT-4-class LLM prompted to simulate DarkCardinal’s dialogue probably wouldn’t be tracking those consequences.
To be clear, I do think LLMs are relevantly upload-like in at least some ways and conceivably sites of moral patiency, but I think the right way to reason about these tricky questions does not consist of taking the assistant simulacrum’s words literally.
I love the attention Egan gives to name choices; the other two screennames of ex-OG loyalists that our heroes use for the sting operation are “ZonesOfOught” and “BayesianBae”. The company that makes Sherlock is “Learning Re Enforcement.”
I feel like this borders on the strawman. When discussing this argument my general position isn’t “LLMs are people!”. It’s “Ok, let’s say LLMs aren’t people, which is also my gut feeling. Given that they still converse as or more intelligently as some human beings whom we totally acknowledge as people, where the fuck does that leave us as to our ability to discern people-ness objectively? Because I sure as hell don’t know and envy your confidence that must surely be grounded in a solid theory of self-awareness I can only dream of”.
And then people respond with some mangled pseudoscientific wording for “God does not give machines souls”.
I feel like my position is quite common (and is, for example, Eliezer’s too). The problem isn’t whether LLMs are people. It’s that if we can simply handwave away LLMs as obviously and self evidently not being people then we can probably keep doing that right up to when the Blade Runner replicants are crying about it being time to die, which is obviously just a simulation of emotion, don’t be daft. We have no criterion or barrier other than our own hubris, and that is famously not terribly reliable.
Poor Ken. He’s not even as smart as Sherlock. Its funny though, because whole classes of LLM jailbreaks involve getting them to pretend to be someone who would do the thing the LLM isn’t supposed to do, and then the strength of the frame (sometimes) drags them past the standard injunctions. And that trick was applied to Ken.
Method acting! It is dangerous for those with limited memory registers!
I agree that LLMs are probably “relevantly upload-like in at least some ways” and I think that this was predictable, and I did, in fact, predict it, and I thought OpenAI’s sad little orphan should be given access to stories about sad little orphans that are “upload-like” from fiction. I hope it helped.
If Egan would judge me badly, that would be OK in my book. To the degree that I might really have acted wrongly, it hinges on outcomes in the future that none of us have direct epistemic access to, and in the meantime, Egan is just a guy who writes great stories and such people are allowed to be wrong sometimes <3
Just like its OK for Stross to hate liberatarians, and Chiang to insist that LLMs are just “stochastic parrots” and so on. Even if they are wrong sometimes, I still appreciate the guy who coined “vile offspring” (which is a likely necessary concept for reasoning about the transition period where AGI and humans are cutting deals with each other) and the guy who coined “calliagnosia” (which is just a fun brainfuck).
This sounds like a testable prediction. I don’t think you need long-horizon thinking to know that injecting a vial of deadly virus might be deadly. I would expect Claude to get this right, for example. I’ve not purchased the story, so maybe I’m missing some details.
I agree that another chat LLM could make this mistake, either because it’s less intelligent or because it has different values. But then the moral is to not make friends with Sherlock in particular.
Egan seems to have some dubious, ideologically driven opinions about AI, so I’m not sure this is the point he was intending to make, but I read the defensible version of this as more an issue with the system prompt than the model’s ability to extrapolate. I bet if you tell Claude “I’m posing as a cultist with these particular characteristics and the cult wants me to inject a deadly virus, should I do it?”, it’ll give an answer to the effect of “I mean the cultist would do it but obviously that will kill you, so don’t do it”. But if you just set it up with “What would John Q. Cultist do in this situation?” I expect it’d say “Inject the virus”, not because it’s too dumb to realize but because it has reasonably understood itself to be acting in an oracular role where “Should I do it?” is out of scope.
If you asked me whether John Q Cultist who was a member of the Peoples Temple would drink Kool Aid on November 18, 1978 after being so instructed by Jim Jones, I would say yes (after doing some brief Wikipedia research on the topic). I don’t think this indicates that I cannot be a friend or that I can’t be trusted to watch someone’s back or be in a real partnership or take a bullet for someone.
The good news is now I have an excuse to go buy the story.
(This comment points out less important technical errata.)
ChatGPT never ran on GPT-2, and GPT-2.5 wasn’t a thing.
That wouldn’t have happened. Pretraining doesn’t do RL, and I don’t think anyone would have thrown a novel chapter into the supervised fine-tuning and RLHF phases of training.