I agree with you here, although something like “predict the next token” seems more and more likely. That said, I’m not sure whether this belongs to the same class of goals as paperclip maximizing in this context, or whether the kind of failure it could lead to would be similar.