I still don’t understand the motivation. Is the hope that it’s easier to make inferences about “what <X value learning algorithm> would infer from observing humans in some hypothetical that doesn’t actually happen” than about “what humans would do if they thought for a very long time”?
This idea is actually very similar to Paul’s idea, but doesn’t require such an ideal setup.