Nanda Ale comments on Agentized LLMs will change the alignment landscape

Nanda Ale 9 Apr 2023 11:59 UTC
15 points
7
I’m not at all confident Auto-GPT could achieve its goals, just that in narrower domains the specific system or arrangement of prompt interactions matters. To give a specific example, I goof around trying to get good longform D&D games out of ChatGPT. (Originally even out of GPT-2 fine-tuned on Crit Role transcripts.) Some implementations just work way better than others.
The trivial system is no system: just play D&D. Works great until it feels like the DM is the main character in Memento. The trivial next step is a rolling context window. The conversation fills up, you ask for a summary, and you start a new conversation with the summary. Just that is a lot better. But you really feel the loss of detail in the sudden jump, so why not make it continuous? A secretary GPT with one job: prune the DM GPT conversation text after every question and answer, always trying to keep what is most important and most recent. Smoother than the summary system. Maybe the secretary can not just delete but keep some details instead, maybe use half its tokens for a permanent game-state. Then it can edit useful details in/out of the conversation history. Can the secretary write a text file for old conversations? Etc. etc.
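A rough sketch of the secretary idea, to make it concrete. Everything here is an assumption for illustration: the `Secretary` class, the word-count stand-in for a tokenizer, and the `summarize` callable (which in practice would be a call to a second model) are all hypothetical names, not anything from Auto-GPT or ChatGPT.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class Secretary:
    budget: int                            # total tokens allowed for state + history
    summarize: Callable[[str, str], str]   # (old_state, dropped_text) -> new_state
    game_state: str = ""                   # the "permanent" half of the budget
    history: List[str] = field(default_factory=list)  # most recent turns, oldest first

    def add_turn(self, turn: str) -> None:
        # Called after every question and answer, so pruning is continuous.
        self.history.append(turn)
        self.prune()

    def prune(self) -> None:
        history_budget = self.budget // 2
        state_budget = self.budget - history_budget
        dropped: List[str] = []
        # Drop the oldest turns until the recent history fits its half of the budget.
        while len(self.history) > 1 and sum(map(count_tokens, self.history)) > history_budget:
            dropped.append(self.history.pop(0))
        if dropped:
            # Fold whatever was dropped into the game state instead of just deleting it.
            new_state = self.summarize(self.game_state, "\n".join(dropped))
            # Hard cap, in case the summarizer ignores its own budget.
            self.game_state = " ".join(new_state.split()[:state_budget])

    def build_prompt(self, system: str) -> str:
        parts = [system]
        if self.game_state:
            parts.append("Game state so far: " + self.game_state)
        parts.extend(self.history)
        return "\n".join(parts)
```

You would call `add_turn` after each DM response and player message, then send `build_prompt(...)` as the next DM prompt. The half-and-half split is just the guess from the paragraph above, not a tuned value.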
Maybe the difference is that the user plays the D&D, so you know immediately when it’s not working well. It’s usually obvious in minutes. Auto-GPT is supposed to be automatic. So they add features and just kind of hope the AI figures it out from there. They don’t get the immediate “this is not working at all” feedback. Like they added embeddings 5 days ago: it just prints the words “Permanent memory:” in the prompt, followed by giant blobs, up to 2500 tokens, of the most related text from Pinecone. Works great for chatbots answering a single question about technical documentation. Real easy to imagine how it could fall apart when done iteratively over longer time periods. I can’t imagine this would work for a D&D game; it might be worse than having no memory. My gut feeling is that you pull the 2500 most related tokens of content into your prompt and the system is overall more erratic. You get the wrong 2500 tokens, it overwhelms whatever the original prompt was, now what is your agent up to? Just checked now, it changed to “This reminds you of these events from your past:”. That might actually make it somewhat less likely to blow up. Basically making the context of the text more clear: “These are old events and thoughts, and you are reminded of them, don’t take this text too seriously, this text might not even be relevant so maybe you should even ignore it. It’s just some stuff that came to mind, that’s how memories work sometimes.”
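To make the memory-injection step concrete, here is a minimal sketch of that kind of prompt assembly. It is not Auto-GPT's code: the function names are hypothetical, and a toy word-overlap score stands in for the embedding lookup against Pinecone. The 2500-token cap and the header wording come from the description above; skipping memories with zero overlap is my own added assumption, as one cheap way to avoid pulling in the wrong tokens.

```python
from typing import List


def tokens(text: str) -> List[str]:
    # Crude stand-in for a real tokenizer.
    return text.lower().split()


def overlap_score(query: str, memory: str) -> float:
    # Jaccard word overlap; a toy substitute for embedding similarity.
    q, m = set(tokens(query)), set(tokens(memory))
    if not q or not m:
        return 0.0
    return len(q & m) / len(q | m)


def select_memories(query: str, memories: List[str],
                    max_tokens: int, min_score: float = 0.0) -> List[str]:
    ranked = sorted(memories, key=lambda m: overlap_score(query, m), reverse=True)
    chosen: List[str] = []
    used = 0
    for mem in ranked:
        if overlap_score(query, mem) <= min_score:
            break  # everything after this is even less related
        cost = len(tokens(mem))
        if used + cost > max_tokens:
            continue  # too big for the remaining budget; try a smaller one
        chosen.append(mem)
        used += cost
    return chosen


def build_prompt(task: str, query: str, memories: List[str],
                 max_memory_tokens: int = 2500,
                 header: str = "This reminds you of these events from your past:") -> str:
    chosen = select_memories(query, memories, max_memory_tokens)
    parts = [task]
    if chosen:
        # The header frames the text as a possibly irrelevant recollection,
        # not as instructions that compete with the task.
        parts.append(header)
        parts.extend(chosen)
    parts.append(query)
    return "\n".join(parts)
```

Swapping `header` back to "Permanent memory:" gives the earlier behavior, which makes it easy to compare the two framings side by side on the same memories.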