The main difficulty is that you still need to translate between the formal language of computations and something humans can understand in practice (which probably means natural language). This is similar to Dialogic RL. So you still need an additional subsystem for making this translation, e.g. AQD. At which point you can ask, why not just apply AQD directly to a pivotal[1] action?
I’m not sure what the answer is. Maybe we should apply AQD directly, or maybe AQD is too weak for pivotal actions but good enough for translation. Or maybe it’s not even good enough for translation, in which case it’s back to the drawing board. (Similar considerations apply to other options like IDA.)
[1] More precisely, to something “quasipivotal” like “either give me a pivotal action or some improvement on the alignment protocol I’m using right now”.