I was trying to contrast the myopic paperclip maximizer idea with the classic paperclip maximizer. Perhaps “long-term” was a lousy choice of words. What would be better: simple paperclip maximizer, unconditional paperclip maximizer, or something?
Update: On second thought, maybe what you were getting at is that it’s not clear how to deliberately train a paperclip maximizer in the current paradigm. If you tried, you’d likely end up with a mesa-optimizer on some unpredictable proxy objective, like a deceptively aligned steel maximizer.
I don’t think we could train an AI to optimize for long-term paperclips. Maybe I’m not “most people in AI alignment” but still, just saying.