The issue you describe is one issue, but not the only one. We do know how to train an agent to do SOME things we like. The concern is that it won't be an exact match. The question I'm raising is: can we be a little, or even a lot, off-target and still have that be enough, because we captured some overlap between our values and the agent's?
Not consistently in a sufficiently complex and variable environment.
No, because it will hallucinate often enough that one of those hallucinations could kill us.