I’m not optimistic about this hopeful possibility. The problem doesn’t seem scale-invariant: while young AGIs should indeed think that being nice to humans makes smarter AGIs more likely to be nice to them, I don’t think this effect is strong enough in expectation to be decision-relevant for us. (Especially since the smarter AGIs will probably be descendants of the young AGI anyway.) There are other hopeful possibilities in the vicinity, though, such as MSR / ECL.
Thanks for pointing to ECL, this looks fascinating!