Is Eliezer’s claim that it is impossible for a perfect reasoner to deceive themselves, or that it is impossible for real-life humans to deceive themselves?
I assume he doesn’t argue that crazy people can’t deceive themselves. But then where is the boundary between crazy and perfect? And if the claim only applies to perfect reasoners, of what use is it?
Oh no, I am claiming that even a perfect reasoner can deceive himself. A normal person can easily do so. Many people who marry someone of a different faith become quite devout in their spouse’s religion. At some point they have to decide to believe something they don’t actually believe. It does not take a superintelligent AI to convince them; a local cleric can do it.