"Early work tends to be less relevant in the context of modern machine learning."
I’m curious why you think the orthogonality thesis, instrumental convergence, the treacherous turn, or Goodhart’s law arguments are less relevant in the context of modern machine learning. (For concreteness, we can use Facebook’s feed-creation algorithm as an example of modern machine learning.)
It seems to me that the orthogonality thesis doesn’t apply to modern machine learning. None of the techniques we know of for training AI systems to general capabilities seems compatible with creating a paperclip maximiser (see the sketch after this list):
Unsupervised learning on human generated/curated data
Reinforcement learning from human feedback
Imitation learning
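To make the second item concrete, here is a minimal sketch of the reward-modelling step at the heart of RLHF. Everything in it (the features, the hidden weights, the learning rate) is invented for illustration; it is not any lab’s actual pipeline. The point is structural: the only training signal is which of two outputs a (simulated) human preferred, so the learned reward is anchored to human judgements, and there is no channel through which an arbitrary objective like paperclip-maximisation could enter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden weights stand in for whatever human raters actually care about.
true_human_weights = np.array([1.0, -0.5, 0.3])

# Each candidate output is summarised as a feature vector; a simulated
# annotator labels which of the two outputs in each pair they prefer.
a_feats = rng.normal(size=(2000, 3))
b_feats = rng.normal(size=(2000, 3))
labels = (a_feats @ true_human_weights > b_feats @ true_human_weights).astype(float)

# Fit a linear reward r(x) = w @ x by gradient ascent on the Bradley-Terry
# log-likelihood: P(a preferred over b) = sigmoid(r(a) - r(b)). Pairwise
# human preferences are the *only* training signal the reward model sees.
diffs = a_feats - b_feats
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(diffs @ w)))
    w += 0.5 * (labels - p) @ diffs / len(diffs)

# The learned reward recovers the human direction (up to scale); there is
# no free parameter one could set to "paperclips".
print("learned direction:", np.round(w / np.linalg.norm(w), 3))
print("human   direction:", np.round(true_human_weights / np.linalg.norm(true_human_weights), 3))
```

A linear toy elides everything hard about reward modelling, of course, but it shows where the objective comes from: human comparisons, not a goal slot that could be filled with anything whatsoever.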
This is especially the case if shard theory is true: training an AI across diverse domains/tasks may, of necessity, result in the AI forming value shards useful/relevant to those tasks.
That is, even if we wanted to, we couldn’t necessarily create a generally intelligent paperclip maximiser.
We could create an agent that values paperclips (humans value paperclips to some extent), but most agents that value paperclips aren’t paperclip maximisers.
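A toy calculation makes the gap between valuing and maximising vivid. The utility function and the weights below are invented for this comment: give an agent log utility over paperclips and over everything else, with only a small weight on paperclips, and its optimal policy spends a correspondingly small fraction of its resources on them.

```python
import numpy as np

# Toy agent with a paperclip "value" among other values (all numbers invented).
# Utility is Cobb-Douglas-style: u = 0.1*log(paperclips) + 0.9*log(other),
# with one unit of resources to split between the two.
weights = {"paperclips": 0.1, "other_goods": 0.9}

def utility(paperclip_share):
    other_share = 1.0 - paperclip_share
    return (weights["paperclips"] * np.log(paperclip_share)
            + weights["other_goods"] * np.log(other_share))

shares = np.linspace(0.001, 0.999, 999)
best = shares[np.argmax([utility(s) for s in shares])]

# The optimum puts ~10% of resources into paperclips, not 100%: this agent
# values paperclips but is nowhere near a paperclip maximiser.
print(f"optimal paperclip share: {best:.3f}")
```

With a 0.1 weight the optimum is a 10% paperclip share; nothing about merely valuing paperclips pushes the agent toward converting everything into them.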