Exactly, I think we agree on the claim/crux. Nate would probably say that for the relevant pivotal tasks/levels of intelligence, the AI needs some general-purpose threads that are capable enough that cheating humans becomes easy. Maybe this is also affected by you thinking that some early-training nudges (like grokking a good enough concept of deception) might stick (that is, high path dependence), while Nate expects that eventually something sharp-left-turn-ish (some general enough, partially self-reflective threads) will supersede those (low path dependence)?
At this point I certainly am not confident about anything. I find Nate’s view quite plausible; I just have enough uncertainty currently that I could also imagine it being false.