Section 2.3 seems to be the part that addresses alignment, and the proposed solution is to use reinforcement learning (train the AI on examples of what humans would do) and then to give up (either by leaving a human in the loop forever or just deciding that turning people into paperclips really is better).
The way these kinds of problems keep getting buried deep in the writing (sometimes through linked PDFs) really makes me think this is some sort of Sokal-hoax-style prank.
What’s so bad about keeping a human in the loop forever? Do we really think we can safely abdicate our moral responsibilities?
It defeats the purpose of AI, so realistically no one will do it.
It doesn’t actually solve the problem if the AI is deceptive.
I’m not convinced we can safely run AGI, with or without a human in the loop. That’s what the alignment problem is.
Then maybe the alignment problem is a stupid problem to try to solve? I don’t believe this, and have spent the past five years working on the alignment problem. But your argument certainly seems like a general purpose argument that we could and should surrender our moral duties to a fancy algorithm as a cost-saving measure, and that anyone who opposes that is a technophobe who Does Not Get the Science.