Coincidentally I’m also trying to understand this post at the same time, and was somewhat confused by the “upstream”/“downstream” distinction.
What I eventually concluded was that there are 3 ways a daemon that intrinsically values optimizing some Y can “look like” it’s optimizing X:
Y = X (this seems both unconcerning and unlikely, and thus somewhat irrelevant)
optimizing Y causes optimization pressure to be applied to X (upstream daemon; this describes humans if Y = our actual goals and X = inclusive genetic fitness)
The daemon directly optimizes X because it believes doing so instrumentally helps it achieve Y (downstream daemon; e.g. if optimizing X helps the daemon survive)
Does this seem correct? In particular, I don’t understand why upstream daemons would have to have a relatively benign goal.
(Summarizing/reinterpreting the upstream/downstream distinction for myself):
“upstream”: has a (relatively benign?) goal which actually helps achieve X
“downstream”: doesn’t
Yeah that seems right. I think it’s a better summary of what Paul was talking about.