My basic take on this question is “that’s doubtful (that humanity will be able to pull off such a thing in the relevant timeframes)”. It seems to me that making a system “deferential all the way down” would require a huge feat of mastery of AI internals that we’re nowhere close to.
We build deferential systems all the time and seem to be pretty good at it. For example, nearly 100% of the individuals in the US military are capable of killing Joe Biden (mandatory retirement age for the military is 62). But nonetheless Joe Biden is the supreme commander of the US armed forces.
We build deferential systems all the time and seem to be pretty good at it. For example, nearly 100% of the individuals in the US military are capable of killing Joe Biden (mandatory retirement age for the military is 62). But nonetheless Joe Biden is the supreme commander of the US armed forces.