Right now I think that section about pre-trained models is simply wrong. RLHF/finetuning basically don't create new capabilities; they just rescale the relative strength of the different algorithms learned during pretraining. If the base model doesn't contain elements corresponding to situational awareness and long-term goals, that means the base model isn't very smart in the first place and is unlikely to become TAI.