A human researcher would see all of the AI’s code and the “pill” (the proposed change), yet even without that element of “chance,” predicting what the change would end up doing is not a solved problem.
If the first human-programmed foom-able AI is not yet orders of magnitude smarter than a human (and it’s doubtful it would be, given that it’s still human-designed), then the AI would have no advantage in understanding its own code that the human researcher wouldn’t also have.
If the human researcher cannot yet solve the problem of keeping the utility function stable under self-modification, why should an AI of similar intelligence be able to, given that both have full access to the code base?
Just remember that it’s the not-yet-foomed AI that has to deal with these issues before it can go weeeeeeeeeeeeeeeeKILLHUMANS (foom).