Rob Bensinger comments on A central AI alignment problem: capabilities generalization, and the sharp left turn

Rob Bensinger 15 Jun 2022 20:59 UTC
5 points
1
A discussion of related ideas on Arbital: mild optimization.
- Michael Soareverix 17 Jun 2022 7:33 UTC
  3 points
  0
  Very cool! So this idea has already been explored, and it doesn’t seem totally unreasonable, though it definitely isn’t a perfect solution. One neat variant is a sort of ‘laziness’ score, so that the AI doesn’t take too many high-impact options.
  It would be interesting to build an AI alignment testing ground: a little simulated civilization where you try to get an AI to align properly with it, given certain commands. I might try to build it in Unity to test some of these ideas in a world that is less abstract than text and slightly more real.
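
One possible reading of the ‘laziness’ score in the reply above is a penalty on impact: each option is scored by its usefulness minus a weighted impact term, and the agent picks the best penalized score. The sketch below is only an illustration under that assumption. The names `Option`, `choose`, `utility`, `impact`, and `laziness`, and all the numbers, are hypothetical; it is not the Arbital formulation of mild optimization, and how to actually measure impact is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    utility: float  # hypothetical: how well the option serves the given command
    impact: float   # hypothetical: how much the option changes the simulated world


def choose(options, laziness=1.0):
    """Return the option with the highest utility after an impact penalty.

    `laziness` is an assumed weight: 0 ignores impact entirely, larger
    values push the agent toward low-impact options.
    """
    if not options:
        raise ValueError("no options to choose from")
    return max(options, key=lambda o: o.utility - laziness * o.impact)


# Toy options for a simulated civilization (made-up values).
options = [
    Option("do nothing", utility=0.0, impact=0.0),
    Option("water the crops", utility=5.0, impact=1.0),
    Option("reroute the river", utility=8.0, impact=10.0),
]

for weight in (0.0, 1.0, 10.0):
    print(weight, choose(options, laziness=weight).name)
```

With these toy numbers, a weight of 0 selects the high-impact option, a moderate weight selects the modest one, and a very large weight makes the agent do nothing at all, which illustrates why such a score is not a perfect solution on its own.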