Explicitly designing a heuristic for the period between reaching superhuman capability and understanding Friendliness might be worthwhile, but it's probably not worth spending resources on at this point.
ISTM that the main thing the AI needs to understand is that a large amount of optimization pressure has already been applied towards Friendliness-like goals; thus, random changes to the state of the world are likely to be bad.