Rob Bensinger comments on The genie knows, but doesn’t care

Rob Bensinger 10 Sep 2013 18:08 UTC
5 points
It’s a problem of sequence. The superintelligence will be able to solve Semantics-in-General, but at that point if it isn’t already safe it will be rather late to start working on safety. Tasking the programmers to work on Semantics-in-General makes things harder if it’s a more complex or roundabout way of trying to address Indirect Normativity; most of the work on understanding what English-language sentences mean can be relegated to the SI, provided we’ve already made it safe to make an SI at all.
- TAG 10 May 2023 12:09 UTC
  1 point
  Parent
  
  “code in the high-level sentence, and let the AI figure it out.”
  
  http://lesswrong.com/lw/rf/ghosts_in_the_machine/
  
  It’s worth noting that using an AI’s semantic understanding of ethics to modify its motivational system is so unghostly and unmysterious that it’s actually been done:
  
  https://astralcodexten.substack.com/p/constitutional-ai-rlhf-on-steroids
  
  But that doesn’t prove much, because it was never the case, not in 2023 and not in 2013, that that kind of self-correction was necessarily an appeal to the supernatural. Using one part of a software system to modify another is not magic!
  
  “The superintelligence will be able to solve Semantics-in-General, but at that point if it isn’t already safe it will be rather late to start working on safety.”
  
  We have AIs with very good semantic understanding that haven’t killed us, and we are working on safety.
- Peterdjones 11 Sep 2013 8:07 UTC
  −1 points
  Parent
  Then solve semantics in a seed.