I suggest that it’s a straw man to claim that anyone has argued ‘the superintelligence wouldn’t understand what you wanted it to do, if you didn’t program it to fully understand that at the outset’. Do you have evidence that this is a position held by, say, anyone at MIRI?
MIRI assumes that programming what you want an AI to do at the outset, i.e., Big Design Up Front, is a desirable feature for some reason.
The most common argument is that it is a necessary prerequisite for provable correctness, which is a desirable safety feature. OTOH, goal flexibility, the exact opposite of massive hardcoding, is itself a necessary prerequisite for corrigibility, which is also a desirable safety feature.
The latter point has not been argued against adequately, IMO.