Instrumental convergence only comes into play when there are free variables in action space which are optimized with respect to their consequences.
I roughly get what this is gesturing at, but I’m still a bit confused. Does anyone have any literature or posts they can point me to that might help explain it?
Also, great post, janus! It has really updated my thinking about alignment.
To me this statement seems mostly tautological. Something is instrumental if it is helpful in bringing about some kind of outcome. The term “instrumental” is always (as far as I can tell) used in reference to some sort of consequence-based optimization.