I think we could play an endless and uninteresting game of “find a real-world example for / against factorization.”
To me, the more interesting discussion is around building better systems for updating on alignment research progress:
What would it look like for this research community to effectively update on results and progress?
What can we borrow from other academic disciplines? E.g. what would “preregistration” look like?
What are the ways more structure and standardization would be limiting / taking us further from truth?
What does the “institutional memory” system look like?
How do we coordinate the work of different alignment researchers and groups to maximize information value?
The problem with not using existing real-world examples as a primary evidence source is that we have far more bits-of-evidence from the existing real world, at far lower cost, than from any other source. Any method which doesn’t heavily leverage those bits necessarily makes progress at a pace orders of magnitude slower.
Also, in order for factorization to be viable for aligning AI, we need the large majority of real-world cognitive problems to be factorizable. So if we can find an endless stream of real-world examples of cognitive problems which humans are bad at factoring, then this class of approaches is already dead in the water.