Maybe everyone who discusses LMA alignment already thinks about the prompting portion of alignment. In that case, this post is largely redundant. You think about LMA alignment a lot; I'm not sure everyone has as clear a mental model.
The remainder of your response points to a bifurcation in mental models that I should clarify in future work on LMAs. What I'm worried about and thinking about is competent, agentic, full AGI built as a language model cognitive architecture. I don't think good terminology exists for this. When I use the term language model agent, I think it evokes an image of something like current agents: not reflective, and without a persistent memory and therefore without a persistent identity.
This is my threat model because I think it's the easiest path to highly capable AGI. I think a model without those properties is shackled; the humans who created its "thought" dataset have an episodic memory in addition to the semantic memory and working memory/context that the language model has. Using those thoughts without episodic memory means not using them as they were meant to be used. Episodic memory is also easy to implement, and it leads naturally to persistent self-created beliefs, including goals and an identity.
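To make "easy to implement" concrete, here's a minimal sketch of the kind of episodic memory loop I have in mind: store summaries of past reasoning, retrieve the most relevant ones into the context on each step, and let the agent's own conclusions (including goals and identity statements) persist that way. The `embed` and `llm` callables are placeholders I'm assuming, not any particular library's API; this is an illustration, not a claim about how any existing agent framework works.

```python
import numpy as np

class EpisodicMemory:
    """Stores past 'episodes' (summaries of prior reasoning/actions) and
    retrieves the most relevant ones for the current context."""
    def __init__(self, embed):
        self.embed = embed          # assumed: text -> np.ndarray embedding function
        self.episodes = []          # list of (vector, text) pairs

    def store(self, text: str):
        self.episodes.append((self.embed(text), text))

    def recall(self, query: str, k: int = 3):
        if not self.episodes:
            return []
        q = self.embed(query)
        # Rank stored episodes by cosine similarity to the current query.
        scored = sorted(
            self.episodes,
            key=lambda ep: float(
                np.dot(q, ep[0]) /
                (np.linalg.norm(q) * np.linalg.norm(ep[0]) + 1e-8)
            ),
            reverse=True,
        )
        return [text for _, text in scored[:k]]

def agent_step(memory: EpisodicMemory, llm, task: str) -> str:
    """One step of the loop: recall relevant episodes, think, then store the result.
    Self-created beliefs persist simply by being written back into memory."""
    recalled = memory.recall(task)
    prompt = "\n".join(["Relevant past episodes:", *recalled, "Current task:", task])
    thought = llm(prompt)   # assumed: prompt string -> completion string
    memory.store(f"Task: {task}\nConclusion: {thought}")
    return thought
```

Even this crude version gives the agent a persistent store of its own prior conclusions, which is the property I'm pointing at.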