Thanks for the post!
You definitely highlight that there’s a continuum here, from “most deliberation-like” (actual humans sitting around thinking) to “least deliberation-like” (highly abstracted machine learning schemes that don’t look much at all like a human sitting around thinking), and in fact you extend this continuum past “doing meta-philosophy” and towards the realm of “doing meta-ethics.”
However, I’d argue against the notion (just arguing in general, not sure if this was your implication) that this continuum is about a tradeoff between quality and practicality, i.e. that “more like the humans sitting around is better if we can do it.” I think deliberation has some bad parts: dumb things humans will predictably do, subagent-alignment problems humans will inevitably cause, novel stimuli humans will predictably be bad at evaluating. FAI schemes that move away from pure deliberation might not do so just for reasons of practicality; they might be doing it because they think it actually serves their purpose best.
I agree with this, and didn’t mean to imply anything against it in the post.