Jeff Rose comments on Godzilla Strategies

Jeff Rose 11 Jun 2022 20:16 UTC
14 points
14
Competition between the powerful can let the less powerful extract value. It can also lead to the less powerful being more ruthlessly exploited by the powerful as a result of that competition. Which outcome occurs depends on the ability of the less powerful to choose between the more powerful. I am not confident that humanity, or parts of it, will have the ability to choose between competing AGIs.
- CarlShulman 11 Jun 2022 20:51 UTC
  4 points
  −5
  This happens during fine-tuning training already, selecting for weights that give the higher human-rated response of two (or more) options. It’s a starting point that can be lost later on, but we do have it now with respect to configurations of weights giving different observed behaviors.
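The selection between two rated responses described above can be sketched roughly as follows. This is a minimal illustration, assuming a Bradley-Terry style pairwise preference loss over a toy linear scorer; the function names, the feature vectors, and the learning rate are all hypothetical and not taken from any particular training setup.

```python
import math


def preference_loss(score_chosen, score_rejected):
    """Negative log-likelihood that the human-preferred response wins.

    Assumes a Bradley-Terry model: P(chosen wins) = sigmoid(score difference).
    """
    diff = score_chosen - score_rejected
    # -log(sigmoid(diff)) == log(1 + exp(-diff))
    return math.log1p(math.exp(-diff))


def score(weights, features):
    """Toy linear scorer standing in for a model's rating of a response."""
    return sum(w * f for w, f in zip(weights, features))


def gradient_step(weights, feats_chosen, feats_rejected, lr=0.1):
    """One gradient-descent step that favors the human-preferred response."""
    diff = score(weights, feats_chosen) - score(weights, feats_rejected)
    # d(loss)/d(diff) = -sigmoid(-diff) = -1 / (1 + exp(diff))
    dloss_ddiff = -1.0 / (1.0 + math.exp(diff))
    return [
        w - lr * dloss_ddiff * (fc - fr)
        for w, fc, fr in zip(weights, feats_chosen, feats_rejected)
    ]


# Hypothetical toy data: two responses, with a human rater preferring the first.
weights = [0.0, 0.0]
chosen, rejected = [1.0, 0.0], [0.0, 1.0]

before = preference_loss(score(weights, chosen), score(weights, rejected))
weights = gradient_step(weights, chosen, rejected)
after = preference_loss(score(weights, chosen), score(weights, rejected))
```

Each step moves the weights toward configurations that score the human-preferred response higher, so `after` is smaller than `before`. Nothing in the sketch guarantees that this preference persists under later training, which matches the caveat that it is a starting point that can be lost.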