This is somewhat related: https://blog.openai.com/debate/
litvand
Karma: 7
“From a portfolio approach perspective, a particular research avenue is worthwhile if it helps to cover the space of possible reasonable assumptions. For example, while MIRI’s research is somewhat controversial, it relies on a unique combination of assumptions that other groups are not exploring, and is thus quite useful in terms of covering the space of possible assumptions.”
https://vkrakovna.wordpress.com/2017/08/16/portfolio-approach-to-ai-safety-research/
I agree that in the long term, agent AI could probably improve faster than CAIS, but I think CAIS could still be a solution.
Regardless of how it is aligned, aligned AI will tend to improve more slowly than unaligned AI, because it is pursuing a more complicated goal, human oversight takes time, and so on. To prevent unaligned AI, aligned AI will need a head start, so that it can stop any unaligned AI while that AI is still much weaker. I don’t think CAIS is fundamentally different in that respect.
If the post’s reasoning that CAIS will be developed before AGI holds up, then CAIS would actually have an advantage, because a head start would be easier to get.