! I’m genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.
LDT says that, for the purposes of making quasi-Kantian [not really Kantian but that’s the closest thing I can gesture at OTOH that isn’t just “read the Yudkowsky”] correct decisions, you have to treat the hostile telepaths as copies of yourself.
Indexical uncertainty, i.e. not knowing whether you’re in Omega’s simulation or the real world, means that even if “I would never do that”, when someone is “doing that” to me in ways I can’t ignore, I have to act as though I might myself end up in a situation where I’m basically forced to “do that”.
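A toy numeric sketch of that expected-value logic, with invented policies and payoffs (the numbers are purely illustrative): under indexical uncertainty you score each policy by averaging over which seat you might turn out to occupy.

```python
# Toy model: choose a policy under indexical uncertainty by averaging its
# payoff over the seats you might occupy. All numbers are invented.

# payoffs[policy] = (payoff when you're the copy forced to "do that",
#                    payoff when it's being done to you)
payoffs = {
    "refuse even under duress": (-5, +3),
    "do it only under duress":  (+1, -1),
}

p_forced_seat = 0.5  # credence that you're the copy in the "forced" seat

for policy, (as_actor, as_target) in payoffs.items():
    ev = p_forced_seat * as_actor + (1 - p_forced_seat) * as_target
    print(f"{policy}: expected value = {ev:+.1f}")
```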
I can still preferentially withhold reward from copies of myself that are executing quasi-threats, though. And in fact this is correct because it minimizes quasi-threats in the mutual copies-of-myself negotiating equilibrium.
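A toy sketch of that equilibrium claim, with invented payoffs: the quasi-threatener only threatens if threatening pays against your policy, so a policy of never rewarding threats deters them before they’re made.

```python
# Toy model: withholding reward from quasi-threats minimizes them in
# equilibrium, because threatening only happens when it pays. Numbers invented.

def threat_is_made(you_give_in: bool) -> bool:
    gain_if_you_give_in = 2    # what a threatener extracts when you cave
    cost_of_carrying_out = 1   # what it costs them to execute an ignored threat
    ev_of_threatening = gain_if_you_give_in if you_give_in else -cost_of_carrying_out
    return ev_of_threatening > 0

def your_payoff(you_give_in: bool) -> int:
    if not threat_is_made(you_give_in):
        return 0                          # your policy deterred the threat entirely
    return -2 if you_give_in else -3      # pay the extortion, or eat the executed threat

for you_give_in in (True, False):
    label = "reward threats" if you_give_in else "never reward threats"
    print(f"{label}: threatened={threat_is_made(you_give_in)}, payoff={your_payoff(you_give_in)}")
```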
“Acquire the ability to coerce, rather than being coerced by, other agents in my environment” is not a solution to anything, because the quasi-Rawlsian [again, not really Rawlsian, but I don’t have any better non-Yudkowsky reference points OTOH] perspective means that if you precommit to acquire power, you end up in expectation getting trodden on just as much as you trod on the other copies of you. So you’re right back where you started.
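A toy numeric version of that symmetry, again with invented numbers: since the policy is chosen once for all copies of you, whatever you extract in the coercer seat you also lose in expectation in the coerced seat, and the only net effect is the cost of the grab itself.

```python
# Toy model: a universal "precommit to acquire power" policy nets out to zero
# transfer across copies, minus the cost of acquiring power. Numbers invented.

transfer = 5.0    # what a coercer extracts from a coerced copy
grab_cost = 1.0   # what each copy spends acquiring coercive power

def expected_value(everyone_grabs_power: bool) -> float:
    gain = transfer if everyone_grabs_power else 0.0   # you, in the coercer seat
    loss = transfer if everyone_grabs_power else 0.0   # you, in the coerced seat
    cost = grab_cost if everyone_grabs_power else 0.0
    return gain - loss - cost

print("everyone precommits to grab power:", expected_value(True))   # -1.0
print("no one grabs power:               ", expected_value(False))  #  0.0
```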
Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.
And I think “be willing to back deceptions” is in fact such a socially-orthogonal improvement.
! I’m genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.
Thanks. :)
And thanks for explaining. I’m not sure what “quasi-Kantian” or “quasi-Rawlsian” mean, and I’m not sure which piece of Eliezer’s material you’re gesturing toward, so I think I’m missing some key steps of reasoning.
But on the whole, yeah, I mean defensive power rather than offensive. The offensive stuff is relevant only to the extent that it works for defense. At least that’s how it seems to me! I haven’t thought about it very carefully. But the whole point is, what could make me safe if a hostile telepath discovers a truth in me? The “build power” family of solutions is based on neutralizing the relevance of the “hostile” part.
I think you’re saying something more sophisticated than this. I’m not entirely sure what it is. Like here you say:
Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.
I’m not sure what “the lineup” refers to, so I don’t know what it means for something to be orthogonal to my position in it.
I think I follow and agree with what you’re saying if I just reason in terms of “setting up arms races is bad, all else being equal”.
Or to be more precise, if I take the dangers of adaptive entropy seriously and I view “create adaptive entropy to get ahead” as a confused pseudo-solution. It might be that that’s my LDT-like framework.
I once thought “slack mattered more than any outcome”. But whose slack? It’s wonderful for all humans to have more slack. But there’s a huge game-theoretic difference between the species being wealthier, and thus wealthier per capita, and being wealthy/high-status/dominant/powerful relative to other people. The first is what I was getting at by “things orthogonal to the lineup”; the second is “the lineup”. Trying to improve your position relative to copies of yourself in a way that is zero-sum is “the rat race”, or “the Red Queen’s race”, where running will ~only ever keep you in the same place, and cause you and your mirror-selves to expend a lot of effort that is useless if you don’t enjoy it.
[I think I enjoy some amount of “the rat race”, which is part of why I find myself doing any of it, even though I can easily imagine tweaking my mind such that I stop doing it and thus exit an LDT negotiation equilibrium where I need to do it all the time. But I only like it so much, and only certain kinds.]
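A toy sketch of that zero-sum lineup point, with invented scores: when every copy spends the same extra positional effort, the lineup doesn’t move and the effort is pure cost.

```python
# Toy model of the Red Queen's race: equal extra effort from every copy leaves
# the lineup unchanged, so the positional effort is wasted. Numbers invented.

def lineup(scores):
    """Indices of the copies, ordered from highest positional score to lowest."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

baseline = [3.0, 2.0, 1.0]                         # positional scores of three copies
effort = 4.0                                       # extra positional effort each copy spends
after_race = [score + effort for score in baseline]

print("lineup before the race:", lineup(baseline))    # [0, 1, 2]
print("lineup after the race: ", lineup(after_race))  # [0, 1, 2] -- same positions
print("effort spent per copy: ", effort)              # nonzero cost, zero positional gain
```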