I still don’t understand the motivation. Is the hope that “what <X value learning algorithm> would infer from observing humans in some hypothetical that doesn’t actually happen” is easier to make inferences about than “what humans would do if they thought for a very long time”?