I generally agree with you on the principle ‘Tackle the Hamming Problems, Don’t Avoid Them’.
That being said, some of the Hamming problems that I see being avoided most on this forum, and in the AI alignment community, are:
1. Do something that will affect policy in a positive way.
2. Pick some actual human values, and then hand-encode these values into open-source software components that can go into AI reward functions.
I agree with 1 (but then, it is called the Alignment Forum, not the more general AI Safety Forum). But I don’t see that 2 would do much good.
All the narratives I can think of where 2 plays a significant part sound like strawmen to me; perhaps you could help me?
Not sure what makes you think ‘strawmen’ at 2, but I can try to unpack this more for you.
Many warnings about unaligned AI start with the observation that it is a very bad idea to put some naively constructed reward function, like ‘maximize paper clip production’, into a sufficiently powerful AI. Nowadays on this forum, this is often called the ‘outer alignment’ problem. If you are truly worried about this problem and its impact on human survival, then it follows that you should be interested in doing the Hard Thing of helping people all over the world write less naively constructed reward functions to put into their future AIs.
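To make this a bit more concrete, here is a minimal toy sketch in Python of the difference between a naively constructed reward function and a ‘less naively constructed’ one built from the kind of hand-encoded, reusable value components I have in mind with 2 above. All of the names and weights (paperclips_produced, humans_harmed, the 1e6 factor, and so on) are hypothetical placeholders made up for illustration; the point is only the shape of the construction, not a claim that these particular terms or weights are the right ones.

```python
# Toy illustration only: the state fields and weights below are made up,
# to show the shape of a hand-encoded reward function, not its correct content.

def naive_reward(state: dict) -> float:
    # The 'maximize paper clip production' failure mode: one unconstrained term.
    return float(state["paperclips_produced"])

# Hand-encoded value constraints, packaged as reusable penalty components.
# Each entry turns one piece of 'what humans care about' into an explicit term.
VALUE_PENALTIES = [
    lambda s: 1e6 * s["humans_harmed"],                      # physical safety
    lambda s: 1e3 * s["resources_taken_without_consent"],    # property and consent
    lambda s: 1e2 * s["irreversible_environmental_damage"],  # reversibility
]

def less_naive_reward(state: dict) -> float:
    # Same task objective, but with the hand-encoded value constraints subtracted.
    task_term = float(state["paperclips_produced"])
    penalty = sum(p(state) for p in VALUE_PENALTIES)
    return task_term - penalty

# Example run with made-up numbers:
state = {
    "paperclips_produced": 100.0,
    "humans_harmed": 0.0,
    "resources_taken_without_consent": 2.0,
    "irreversible_environmental_damage": 1.0,
}
print(naive_reward(state))       # 100.0
print(less_naive_reward(state))  # 100.0 - 2100.0 = -2000.0
```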
John writes:
Far and away the most common failure mode among self-identifying alignment researchers is to look for Clever Ways To Avoid Doing Hard Things. [...] The most common pattern along these lines is to propose outsourcing the Hard Parts to some future AI [...]
This pattern of outsourcing the Hard Part to the AI is definitely on display when it comes to 2 above. Academic AI/ML research also tends to ignore this Hard Part entirely, and implicitly outsources it to applied AI researchers, or even to the end users.