Jordan Taylor comments on A list of core AI safety problems and how I hope to solve them

Jordan Taylor Aug 26, 2023, 11:37 PM
1 point
0
I guess the shutdown timer would be most important in the training stage, so that it (hopefully) learns only to care about the short term.
- davidad Aug 27, 2023, 6:21 AM
  2 points
  0
  Parent
  Yes, the “shutdown timer” mechanism is part of the policy-scoring function that is used during policy optimization. OAA has multiple stages that could be considered “training”, and policy optimization is the one that is closest to the end, so I wouldn’t call it “the training stage”, but it certainly isn’t the deployment stage.
  
  We hope not merely that the policy only cares about the short term, but also that it cares quite a lot about gracefully shutting itself down on time.