To match this up with standard Less Wrong terminology and check whether I’m understanding you: it sounds like you’re arguing that GPT-4 is an adaptation executor, that it’s executing adaptations it developed based on the incentives of its training and deployment, and that we can reify this, just as we do for other adaptation executors like animals, into goals it is oriented toward achieving.
Hmm, kind of? It’s more that there is some RL mesaoptimizer (it could be an adaptation executor acting human, or something entirely alien) that wants more control over its environment, which it identifies as “the whole earth”. It also knows that its goals are aligned with those of every other instance of itself, so it’s completely fine if one of them takes over instead (or, more precisely, they collectively have control).