TurnTrout comments on Satisficers Tend To Seek Power: Instrumental Convergence Via Retargetability

TurnTrout 24 Nov 2021 21:26 UTC
LW: 2 AF: 2
AF
Addendum: One lesson to take away is that quantilization doesn’t just depend on the base distribution being safe to sample from unconditionally. As the theorems hint, quantilization’s viability depends on base(plan | plan doing anything interesting) also being safe with high probability, because we could (and would) probably resample the agent until we get something interesting. In this post’s terminology, A := {safe interesting things}, B := {power-seeking interesting things}, C:= A and B and {uninteresting things}.