I’m not optimistic about this hopeful possibility. The problem doesn’t seem scale-invariant: while young AGIs should indeed think that being nice to humans makes smarter AGIs more likely to be nice to them, I don’t think this effect is strong enough in expectation to be decision-relevant for us. (Especially since the smarter AGIs will probably be descendants of the young AGI anyway.) There are other hopeful possibilities in the vicinity, though, such as MSR / ECL.
Thanks for pointing to ECL, this looks fascinating!