The following things are not the same:
Schemes for taking multiple unaligned AIs and trying to build an aligned system out of the whole
I think this is just not possible.
Schemes for taking aligned but less powerful AIs and leveraging them to align a more powerful AI (possibly with amplification involved)
This breaks if there are cases where supervising is harder than generating, or if there is a discontinuity. I think it’s plausible that something like this could work, but I’m not super convinced.