Oops, forgot about that. You’re right, he didn’t rule that out.
Is there a reason you don’t list his “A deeper solution” here? (Or did I miss it?) Because it trades off against capabilities? Or something else?
Mainly for brevity, but also because it seems to involve quite a drastic change in how the reward function/model as a whole works, so it doesn’t seem particularly likely to be implemented.