I am not sure how much ‘not destabilize people’ is an option that is available to Vassar.
My model of Vassar is as a person who is constantly making associations, and using them to point at the moon. However, pointing at the moon can convince people of nonexistent satellites and thus drive people crazy. This is why we have debates instead of koan contests.
Pointing at the moon is useful when there is inferential distance; we use it all the time when talking with people without rationality training. Eliezer used it, and a lot of “you are expected to behave better for status reasons look at my smug language”-style theist-bashing, in the Sequences. This was actually highly effective, although it had terrible side effects.
I think that if Vassar tried not to destabilize people, it would heavily impede his general communication. He just talks like this. One might say, “Vassar, just only say things that you think will have a positive effect on the person.” 1. He already does that. 2. That is advocating that Vassar manipulate people. See Valencia in Worth the Candle.
In the pathological case of Vassar, I think the naive strategy of “just say the thing you think is true” is still correct.
Mental training absolutely helps. I would say that, considering that the people who talk with Vassar are literally from a movement called rationality, it is a normatively reasonable move to expect them to be mentally resilient. Factually, this is not the case. The “maybe insane” part is definitely not unavoidable, but right now I think the problem is with the people talking to Vassar, and not he himself.
I think that if Vassar tried not to destabilize people, it would heavily impede his general communication.
My suggestion for Vassar is not to ‘try not to destabilize people’ exactly.
It’s to very carefully examine his speech and its impacts, by looking at the evidence available (asking people he’s interacted with about what it’s like to listen to him) and also learning how to be open to real-time feedback (like, actually look at the person you’re speaking to as though they’re a full, real human—not a pair of ears to be talked into or a mind to insert things into). When he talks theory, I often get the sense he is talking “at” rather than talking “to” or “with”. The listener practically disappears or is reduced to a question-generating machine that gets him to keep saying things.
I expect this process could take a long time / run into issues along the way, and so I don’t think it should be rushed. Not expecting a quick change. But claiming there’s no available option seems wildly wrong to me. People aren’t fixed points and generally shouldn’t be treated as such.
This is actually very fair. I think he does kind of insert information into people.
I never really felt like a question-generating machine, more like a pupil at the foot of a teacher who is trying to integrate the teacher’s information.
I think the passive, reactive approach you mention is actually a really good idea of how to be more evidential in personal interaction without being explicitly manipulative.
It’s to very carefully examine his speech and its impacts, by looking at the evidence available (asking people he’s interacted with about what it’s like to listen to him) and also learning how to be open to real-time feedback (like, actually look at the person you’re speaking to as though they’re a full, real human—not a pair of ears to be talked into or a mind to insert things into).
I think I interacted with Vassar four times in person, so I might get some things wrong here, but I think that he’s pretty disassociated from his body which closes a normal channel of perceiving impacts on the person he’s speaking with. This thing looks to me like some bodily process generating stress / pain and being a cause for disassociation. It might need a body worker to fix whatever goes on there to create the conditions for perceiving the other person better.
Beyond that, Circling might be an environment in which one can learn to interact with others as humans who have their own feelings, but that would require opening up to the Circling frame.
I think that if Vassar tried not to destabilize people, it would heavily impede his general communication. He just talks like this. One might say, “Vassar, just only say things that you think will have a positive effect on the person.” 1. He already does that. 2. That is advocating that Vassar manipulate people.
You are making a false dichotomy here. You are assuming that everything that has a negative effect on a person is manipulation.
As Vassar himself sees the situation, people believe a lot of lies for the sake of fitting in socially. From that perspective, getting people to stop believing in those lies will make it harder for them to fit into society.
If you get a Nazi guard at Auschwitz into a state where the moral issue of their job can’t be disassociated anymore, that’s very predictably going to have a negative effect on that prison guard.
Vassar’s position would be that it would be immoral to avoid talking about the truth about the nature of their job when talking with the guard, out of a motivation to make life easier for the guard.
I think this line of discussion would be well served by marking a natural boundary in the cluster “crazy.” Instead of saying “Vassar can drive people crazy” I’d rather taboo “crazy” and say:
Many people are using their verbal idea-tracking ability to implement a coalitional strategy instead of efficiently compressing external reality. Some such people will experience their strategy as invalidated by conversations with Vassar, since he’ll point out ways their stories don’t add up. A common response to invalidation is to submit to the invalidator by adopting the invalidator’s story. Since Vassar’s words aren’t selected to be a valid coalitional strategy instruction set, attempting to submit to him will often result in attempting obviously maladaptive coalitional strategies.
People using their verbal idea-tracking ability to implement a coalitional strategy cannot give informed consent to conversations with Vassar, because in a deep sense they cannot be informed of things through verbal descriptions, and the risk is one that cannot be described without the recursive capacity of descriptive language.
Personally I care much more, maybe lexically more, about the upside of minds learning about their situation, than the downside of mimics going into maladaptive death spirals, though it would definitely be better all round if we can manage to cause fewer cases of the latter without compromising the former, much like it’s desirable to avoid torturing animals, and it would be desirable for city lights not to interfere with sea turtles’ reproductive cycle by resembling the moon too much.
EDIT: Ben is correct to say we should taboo “crazy.”
This is a very uncharitable interpretation (entirely wrong). The highly scrupulous people here can undergo genuine psychological collapse if they learn their actions aren’t as positive utility as they thought. (entirely wrong)
I also don’t think people interpret Vassar’s words as a strategy and implement incoherence. Personally, I interpreted Vassar’s words as factual claims then tried to implement a strategy on them. When I was surprised by reality a bunch, I updated away. I think the other people just no longer have a coalitional strategy installed and don’t know how to function without one. This is what happened to me and why I repeatedly lashed out at others when I perceived them as betraying me, since I no longer automatically perceived them as on my side. I rebuilt my rapport with those people and now have more honest relationships with them. (still endorsed)
The highly scrupulous people here can undergo genuine psychological collapse if they learn their actions aren’t as positive utility as they thought.
“That which can be destroyed by the truth should be”—I seem to recall reading that somewhere.
And: “If my actions aren’t as positive utility as I think, then I desire to believe that my actions aren’t as positive utility as I think”.
If one has such a mental makeup that finding out that one’s actions have worse effects than one imagined causes genuine psychological collapse, then perhaps the first order of business is to do everything in one’s power to fix that (really quite severe and glaring) bug in one’s psyche—and only then to attempt any substantive projects in the service of world-saving, people-helping, or otherwise doing anything really consequential.
Personally, I interpreted Vassar’s words as factual claims then tried to implement a strategy on them. When I was surprised by reality a bunch, I updated away.
What specific claims turned out to be false? What counterevidence did you encounter?
Specific claim: the only nontrivial obstacle in front of us is not being evil
This is false. Object-level stuff is actually very hard.
Specific claim: nearly everyone in the aristocracy is agentically evil. (EDIT: THIS WAS NOT SAID. WE BASICALLY AGREE ON THIS SUBJECT.)
This is a wrong abstraction. Frame of Puppets seems naively correct to me, and has become increasingly reified by personal experience of more distant-to-my-group groups of people, to use a certain person’s language. Ideas and institutions have the agency; they wear people like skin.
Specific claim: this is how to take over New York.
Didn’t work.
I think this needs to be broken up into 2 claims:
1 If we execute strategy X, we’ll take over New York.
2 We can use straightforward persuasion (e.g. appeals to reason, profit motive) to get an adequate set of people to implement strategy X.
2 has been falsified decisively. The plan to recruit candidates via appealing to people’s explicit incentives failed, there wasn’t a good alternative, and as a result there wasn’t a chance to test other parts of the plan (1).
That’s important info and worth learning from in a principled way. Definitely I won’t try that sort of thing again in the same way, and it seems like I should increase my credence both that plans requiring people to respond to economic incentives by taking initiative to play against type will fail, and that I personally might be able to profit a lot by taking initiative to play against type, or investing in people who seem like they’re already doing this, as long as I don’t have to count on other unknown people acting similarly in the future.
But I find the tendency to respond to novel multi-step plans that would require someone to take initiative by sitting back and waiting for the plan to fail, and then saying, “see? novel multi-step plans don’t work!” extremely annoying. I’ve been on both sides of that kind of transaction, but if we want anything to work out well we have to distinguish cases of “we / someone else decided not to try” as a different kind of failure from “we tried and it didn’t work out.”
Specific claim: the only nontrivial obstacle in front of us is not being evil
This is false. Object-level stuff is actually very hard.
This seems to be conflating the question of “is it possible to construct a difficult problem?” with the question of “what’s the rate-limiting problem?”. If you have a specific model for how to make things much better for many people by solving a hard technical problem before making substantial progress on human alignment, I’d very much like to hear the details. If I’m persuaded I’ll be interested in figuring out how to help.
So far this seems like evidence to the contrary, though, as it doesn’t look like you thought you could get help making things better for many people by explaining the opportunity.
I am not sure how much ‘not destabilize people’ is an option that is available to Vassar.
My model of Vassar is as a person who is constantly making associations, and using them to point at the moon. However, pointing at the moon can convince people of nonexistent satellites and thus drive people crazy. This is why we have debates instead of koan contests.
Pointing at the moon is useful when there is inferential distance; we use it all the time when talking with people without rationality training. Eliezer used it, and a lot of “you are expected to behave better for status reasons look at my smug language”-style theist-bashing, in the Sequences. This was actually highly effective, although it had terrible side effects.
I think that if Vassar tried not to destabilize people, it would heavily impede his general communication. He just talks like this. One might say, “Vassar, just only say things that you think will have a positive effect on the person.” 1. He already does that. 2. That is advocating that Vassar manipulate people. See Valencia in Worth the Candle.
In the pathological case of Vassar, I think the naive strategy of “just say the thing you think is true” is still correct.
Mental training absolutely helps. I would say that, considering that the people who talk with Vassar are literally from a movement called rationality, it is a normatively reasonable move to expect them to be mentally resilient. Factually, this is not the case. The “maybe insane” part is definitely not unavoidable, but right now I think the problem is with the people talking to Vassar, and not he himself.
I’m glad you enjoyed the post.
My suggestion for Vassar is not to ‘try not to destabilize people’ exactly.
It’s to very carefully examine his speech and its impacts, by looking at the evidence available (asking people he’s interacted with about what it’s like to listen to him) and also learning how to be open to real-time feedback (like, actually look at the person you’re speaking to as though they’re a full, real human—not a pair of ears to be talked into or a mind to insert things into). When he talks theory, I often get the sense he is talking “at” rather than talking “to” or “with”. The listener practically disappears or is reduced to a question-generating machine that gets him to keep saying things.
I expect this process could take a long time / run into issues along the way, and so I don’t think it should be rushed. Not expecting a quick change. But claiming there’s no available option seems wildly wrong to me. People aren’t fixed points and generally shouldn’t be treated as such.
This is actually very fair. I think he does kind of insert information into people.
I never really felt like a question-generating machine, more like a pupil at the foot of a teacher who is trying to integrate the teacher’s information.
I think the passive, reactive approach you mention is actually a really good idea of how to be more evidential in personal interaction without being explicitly manipulative.
Thanks!
I think I interacted with Vassar four times in person, so I might get some things wrong here, but I think that he’s pretty disassociated from his body which closes a normal channel of perceiving impacts on the person he’s speaking with. This thing looks to me like some bodily process generating stress / pain and being a cause for disassociation. It might need a body worker to fix whatever goes on there to create the conditions for perceiving the other person better.
Beyond that, Circling might be an environment in which one can learn to interact with others as humans who have their own feelings, but that would require opening up to the Circling frame.
You are making a false dichotomy here. You are assuming that everything that has a negative effect on a person is manipulation.
As Vassar himself sees the situation, people believe a lot of lies for the sake of fitting in socially. From that perspective, getting people to stop believing in those lies will make it harder for them to fit into society.
If you get a Nazi guard at Auschwitz into a state where the moral issue of their job can’t be disassociated anymore, that’s very predictably going to have a negative effect on that prison guard.
Vassar’s position would be that it would be immoral to avoid talking about the truth about the nature of their job when talking with the guard, out of a motivation to make life easier for the guard.
I think this line of discussion would be well served by marking a natural boundary in the cluster “crazy.” Instead of saying “Vassar can drive people crazy” I’d rather taboo “crazy” and say:
Personally I care much more, maybe lexically more, about the upside of minds learning about their situation, than the downside of mimics going into maladaptive death spirals, though it would definitely be better all round if we can manage to cause fewer cases of the latter without compromising the former, much like it’s desirable to avoid torturing animals, and it would be desirable for city lights not to interfere with sea turtles’ reproductive cycle by resembling the moon too much.
My problem with this comment is it takes people who:
can’t verbally reason without talking things through (and are currently stuck in a passive role in a conversation)
and who:
respond to a failure of their verbal reasoning
under circumstances of importance (in this case moral importance)
and conditions of stress, induced by
trying to concentrate while in a passive role
failing to concentrate under conditions of high moral importance
by simply doing as they are told—and it assumes they are incapable of reasoning under any circumstances.
It also then denies people who are incapable of independent reasoning the right to be protected from harm.
EDIT: Ben is correct to say we should taboo “crazy.”
This is a very uncharitable interpretation (entirely wrong). The highly scrupulous people here can undergo genuine psychological collapse if they learn their actions aren’t as positive utility as they thought. (entirely wrong)
I also don’t think people interpret Vassar’s words as a strategy and implement incoherence. Personally, I interpreted Vassar’s words as factual claims then tried to implement a strategy on them. When I was surprised by reality a bunch, I updated away. I think the other people just no longer have a coalitional strategy installed and don’t know how to function without one. This is what happened to me and why I repeatedly lashed out at others when I perceived them as betraying me, since I no longer automatically perceived them as on my side. I rebuilt my rapport with those people and now have more honest relationships with them. (still endorsed)
Beyond this, I think your model is accurate.
“That which can be destroyed by the truth should be”—I seem to recall reading that somewhere.
And: “If my actions aren’t as positive utility as I think, then I desire to believe that my actions aren’t as positive utility as I think”.
If one has such a mental makeup that finding out that one’s actions have worse effects than one imagined causes genuine psychological collapse, then perhaps the first order of business is to do everything in one’s power to fix that (really quite severe and glaring) bug in one’s psyche—and only then to attempt any substantive projects in the service of world-saving, people-helping, or otherwise doing anything really consequential.
Thank you for echoing common sense!
What is psychological collapse?
For those who can afford it, taking it easy for a while is a rational response to noticing deep confusion. Continuing to take actions based on a discredited model would be less appealing, and people often become depressed when they keep confusedly trying to do things that they don’t want to do.
Are you trying to point to something else?
What specific claims turned out to be false? What counterevidence did you encounter?
Specific claim: the only nontrivial obstacle in front of us is not being evil
This is false. Object-level stuff is actually very hard.
Specific claim: nearly everyone in the aristocracy is agentically evil. (EDIT: THIS WAS NOT SAID. WE BASICALLY AGREE ON THIS SUBJECT.)
This is a wrong abstraction. Frame of Puppets seems naively correct to me, and has become increasingly reified by personal experience of more distant-to-my-group groups of people, to use a certain person’s language. Ideas and institutions have the agency; they wear people like skin.
Specific claim: this is how to take over New York.
Didn’t work.
I think this needs to be broken up into 2 claims:
1 If we execute strategy X, we’ll take over New York.
2 We can use straightforward persuasion (e.g. appeals to reason, profit motive) to get an adequate set of people to implement strategy X.
2 has been falsified decisively. The plan to recruit candidates via appealing to people’s explicit incentives failed, there wasn’t a good alternative, and as a result there wasn’t a chance to test other parts of the plan (1).
That’s important info and worth learning from in a principled way. Definitely I won’t try that sort of thing again in the same way, and it seems like I should increase my credence both that plans requiring people to respond to economic incentives by taking initiative to play against type will fail, and that I personally might be able to profit a lot by taking initiative to play against type, or investing in people who seem like they’re already doing this, as long as I don’t have to count on other unknown people acting similarly in the future.
But I find the tendency to respond to novel multi-step plans that would require someone to take initiative by sitting back and waiting for the plan to fail, and then saying, “see? novel multi-step plans don’t work!” extremely annoying. I’ve been on both sides of that kind of transaction, but if we want anything to work out well we have to distinguish cases of “we / someone else decided not to try” as a different kind of failure from “we tried and it didn’t work out.”
This is actually completely fair. So is the other comment.
This seems to be conflating the question of “is it possible to construct a difficult problem?” with the question of “what’s the rate-limiting problem?”. If you have a specific model for how to make things much better for many people by solving a hard technical problem before making substantial progress on human alignment, I’d very much like to hear the details. If I’m persuaded I’ll be interested in figuring out how to help.
So far this seems like evidence to the contrary, though, as it doesn’t look like you thought you could get help making things better for many people by explaining the opportunity.