Thanks Ryan!

> What I see to be the main message of the article as currently written is that humans controlling a very powerful tool (especially AI) could drive themselves into a suboptimal fixed point due to insufficient philosophical sophistication. This I agree with.
Hurrah!
> At this round of edits, my main objection would be to the remark that the AI wants us to act as yes-men, which seems dubious if the agent is (i) an Act-based agent or (ii) sufficiently broadly uncertain over values.
I no longer think it wants us to turn into yes-men, and I’ve edited my post accordingly. I still think it will be incentivized to corrupt us, and I don’t see how being an act-based agent would be sufficient, though it’s likely I’m missing something. I agree that if it’s sufficiently broadly uncertain over values, then we’re likely to be fine, but in my head that unpacks into “if we knew the AI were metaphilosophically competent enough, we’d be fine”, which doesn’t help things much.