I think such a prize would be more constructive if it could also reward demonstrations of the difficulty of AI alignment. An outright proof of impossibility is very unlikely in my opinion, but better arguments for the danger of unaligned AI and the difficulty of aligning it seem very possible.
Yes, surely the proof would be very difficult, or itself impossible. However, enough people have the nagging worry that alignment is impossible to justify the effort of seeing whether we can prove that it is...and update.
But if the effort required for a proof is—I don't know, 120 person-months—let's please, Humanity, not walk right past that one into the blades.
I am not advocating that we divert dozens of people from promising alignment work.
Even if it failed, I would hope the prove-impossibility effort would throw off beneficial by-products like:
the alignment difficulty demonstrations Mitchell_Porter raised,
the paring of some alignment paths to save time,
new, promising alignment paths.
_____
I thought there was a 60%+ chance I would get a quick education on the people who are trying or who have tried to prove impossibility.
But I also thought, perhaps this is one of those Nate Soares blind spots...maybe caused by the fact that those who understand the issues are the types who want to fix them.
Has it gotten the attention it needs?
Wonder if we can assign a complexity class to the alignment problem? Even just proving that it's NP-hard would be huge.
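For intuition on what a complexity-class result even looks like: membership in NP means a proposed solution (a "certificate") can be checked in polynomial time, even if finding one may be hard. A minimal sketch using boolean satisfiability, the canonical NP problem—purely illustrative, with no claim that alignment actually reduces to SAT:

```python
def verify_sat(clauses, assignment):
    """Polynomial-time certificate check for SAT.

    `clauses` is a list of clauses; each clause is a list of ints,
    where k means "variable k is true" and -k means "variable k is
    false". `assignment` maps variable numbers to booleans. Runs in
    time linear in the total number of literals.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 OR NOT x2) AND (x2 OR x3)
clauses = [[1, -2], [2, 3]]
print(verify_sat(clauses, {1: True, 2: True, 3: False}))   # True
print(verify_sat(clauses, {1: False, 2: True, 3: False}))  # False
```

The verifier is fast; it's *finding* a satisfying assignment that is believed to be hard. An NP-hardness result for (some formalization of) alignment would say it is at least as hard as that search problem.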