I think this means that if you care both about (a) wholesomeness and (b) ending self-deception, it’s helpful to give yourself full permission to lie as a temporary measure as needed. Creating space for yourself so you can (say) coherently build power such that it’s safe for you to eventually be fully honest.
The first sentence here, I think, verbalizes something important.
The second [instrumental-power] is a bad justification, to the extent that we’re talking about game-theoretic power [as opposed to power over reductionistic, non-mentalizing Nature]. LDT is about dealing with copies of myself. They’ll all just do the same thing [lie for power] and create needless problems.
You do give a good justification that, I think, doesn’t create any needless aggression between copies of oneself, and which I think suffices to justify “backing self-deception” as promising:
I mean something more wholehearted. If I self-deceive, it’s because it’s the best solution I have to some hostile telepath problem. If I don’t have a better solution, then I want to keep deceiving myself. I don’t just tolerate it. I actively want it there. I’ll fight to keep it there! [...]
This works way better if I trust my occlumency skills here. If I don’t feel like I have to reveal the self-deceptions I notice to others, and I trust that I can and will hide it from others if need be, then I’m still safe from hostile telepaths.
[emphases mine]
“I’m not going to draw first, but drawing second and shooting faster is what I’m all about” but for information theory.
I think the word “power” might be creating some confusion here.
I mean something pretty specific and very practical. I’m not sure how to precisely define it, but here are some examples:
If someone threatens to freak out at you if you disagree with them, and you tend to get overwhelmed and panic when they freak out at you, then they have a kind of power over you. Building power here probably looks like learning to experience them freaking out without getting overwhelmed.
If someone pays for your rent and food but might stop if they get any hint that you’re gay, it might not be safe to even ask yourself honestly whether you are. You build power here by getting an income, or a source of rent and food, that doesn’t depend on the hostile telepathic benefactor.
If your lover gets turned on by you politically agreeing with them and turned off by disagreement, you might find your political views drifting toward theirs for “unrelated” reasons. One way to build power here is to get other access to sex. Another is to diminish your libido. Another is to break up with them. (Not saying any of these are a great idea. I’m just naming what the solution of “building power” might look like here.)
I’m not familiar with LDT. I can’t comment on that part. Sorry if that means what I just said misses your point.
! I’m genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.
LDT says that, for the purposes of making quasi-Kantian [not really Kantian but that’s the closest thing I can gesture at OTOH that isn’t just “read the Yudkowsky”] correct decisions, you have to treat the hostile telepaths as copies of yourself.
Indexical uncertainty, i.e. not knowing whether you’re in Omega’s simulation or the real world, means that even if “I would never do that”, when someone is “doing that” to me in ways I can’t ignore, I have to act as though I might someday be in a situation where I’m basically forced to “do that”.
I can still preferentially withhold reward from copies of myself that are executing quasi-threats, though. And in fact this is correct because it minimizes quasi-threats in the mutual copies-of-myself negotiating equilibrium.
“Acquire the ability to coerce, rather than being coerced by, other agents in my environment”, is not a solution to anything—because the quasi-Rawlsian [again, not really Rawlsian, but I don’t have any better non-Yudkowsky reference points OTOH] perspective means that if you precommit to acquire power, you end up in expectation getting trodden on just as much as you trod on the other copies of you. So you’re right back where you started.
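To make that symmetry concrete, here’s a minimal toy model (my own construction; the specific payoffs and the 50/50 role split are made up purely for illustration, not anything from your post). Two copies of the same agent meet, one randomly cast as threatener and one as target. Since both run the same policy, a policy of caving means threats get issued, the extortion transfers cancel across roles, and only the deadweight cost of threatening survives; a policy of never caving makes threats pointless, so no copy issues them.

```python
# Toy model (illustrative numbers only): two copies of the same agent
# meet; a coin flip casts one as "threatener" and one as "target".
# Both copies share one policy, so each plays each role half the time.

COST_OF_THREATENING = 1   # effort/risk of issuing a threat
GAIN_IF_TARGET_CAVES = 5  # what a threatener extorts from a caving target
LOSS_IF_TARGET_CAVES = 5  # what the caving target gives up

def expected_payoff(policy_caves: bool) -> float:
    """Per-encounter expected payoff when every copy shares this policy."""
    if policy_caves:
        # Threats pay, so copies issue them. You extort half the time and
        # get extorted half the time: the transfers cancel, leaving only
        # the cost of threatening.
        as_threatener = GAIN_IF_TARGET_CAVES - COST_OF_THREATENING
        as_target = -LOSS_IF_TARGET_CAVES
    else:
        # Threats never pay, so no copy bothers issuing them.
        as_threatener = 0.0
        as_target = 0.0
    return 0.5 * as_threatener + 0.5 * as_target

print(expected_payoff(policy_caves=True))   # -0.5: everyone worse off
print(expected_payoff(policy_caves=False))  #  0.0: no threats in equilibrium
```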
Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.
And I think “be willing to back deceptions” is in fact such a socially-orthogonal improvement.
! I’m genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.
Thanks. :)
And thanks for explaining. I’m not sure what “quasi-Kantian” or “quasi-Rawlsian” mean, and I’m not sure which piece of Eliezer’s material you’re gesturing toward, so I think I’m missing some key steps of reasoning.
But on the whole, yeah, I mean defensive power rather than offensive. The offensive stuff is relevant only to the extent that it works for defense. At least that’s how it seems to me! I haven’t thought about it very carefully. But the whole point is, what could make me safe if a hostile telepath discovers a truth in me? The “build power” family of solutions is based on neutralizing the relevance of the “hostile” part.
I think you’re saying something more sophisticated than this. I’m not entirely sure what it is. Like here you say:
Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.
I’m not sure what “the lineup” refers to, so I don’t know what it means for something to be orthogonal to my position in it.
I think I follow and agree with what you’re saying if I just reason in terms of “setting up arms races is bad, all else being equal”.
Or to be more precise, if I take the dangers of adaptive entropy seriously and I view “create adaptive entropy to get ahead” as a confused pseudo-solution. It might be that that’s my LDT-like framework.
I once thought “slack mattered more than any outcome”. But whose slack? It’s wonderful for all humans to have more slack. But there’s a huge game-theoretic difference between the species being wealthier, and thus wealthier per capita, and being wealthy/high-status/dominant/powerful relative to other people. The first is what I was getting at by “things orthogonal to the lineup”; the second is “the lineup”. Trying to improve your position relative to copies of yourself in a way that is zero-sum is “the rat race”, or “the Red Queen’s race”, where running will ~only ever keep you in the same place, and cause you and your mirror-selves to expend a lot of effort that is useless if you don’t enjoy it.
[I think I enjoy a certain amount of “the rat race”, which is part of why I find myself doing any of it, even though I can easily imagine tweaking my mind such that I stop doing it and thus exit an LDT negotiation equilibrium where I need to do it all the time. But I only like it so much, and only certain kinds.]
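To put toy numbers on the distinction (again my own illustration; all quantities made up): rank in the lineup is zero-sum across mirror-copies, so a symmetric rank-climbing policy changes no one’s rank and just burns the effort, whereas a gain orthogonal to the lineup also leaves every rank unchanged but with nothing burned.

```python
# Illustrative only: three mirror-copies with identical starting resources.
wealth = [10, 10, 10]

# "The lineup" is rank relative to the other copies. Rank is zero-sum,
# so when every copy runs the same rank-climbing policy and burns 3
# units of effort on it, no rank changes: the Red Queen's race.
rat_race = [w - 3 for w in wealth]
print(rat_race)   # [7, 7, 7]: same relative positions, everyone poorer

# An improvement orthogonal to the lineup (the species getting wealthier)
# also leaves every rank unchanged, but nothing was burned to hold position.
windfall = [w + 3 for w in wealth]
print(windfall)   # [13, 13, 13]: same relative positions, everyone richer
```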