I find the argument that ‘predicting data generated by agents (e.g. language modeling) will lead a model to learn / become an agent’ much weaker than I used to.
This is because I think it only goes through cleanly if the task uses the same inputs and outputs as the agent. This is emphatically not the case for (e.g.) GPT-3, whose inputs and outputs are text tokens rather than the observations and actions of the humans who generated the text.