The most salient example of this is when you try to make ChatGPT play chess and write chess analysis. At some point it will make a mistake and write something like “the queen was captured” when in fact the queen was not captured. This is not the kind of mistake that chess books make, so it truly takes the model out of distribution. What ends up happening is that GPT conditions its future output on its mistake, as if the mistake were correct, which takes it even further outside the distribution of human text, until the game diverges into nonsensical moves.
Is this a limitation in practice? Rap battles are a bad example because they happen to be an exception: a task premised on being one-shot and real-time. But the overall point stands: we ask GPT to do in one try, in one step, tasks that humans do in many steps, iteratively and recursively.
Take this “the queen was captured” problem. As a human, I might be analyzing a game, glance at the wrong move, think a thought about the analysis premised on that move (or even start writing words down!), and then notice the error and just fix it. I am doing this right now, in my thoughts and on the keyboard, writing this comment.
The same thing works with ChatGPT, today. I deal with problems like “the queen was captured” every day just by adding more ChatGPT steps. Instead of one-shotting, every completion chains into a second ChatGPT prompt that checks for mistakes. (You may need a third level to get to something like 99%, because the checker blunders too.) The background chain can either ask for the original answer to be regenerated, or reply to the original ChatGPT conversation describing the error and asking it to fix its mistake. The latter form seems useful for code generation.
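For concreteness, here is a minimal sketch of the kind of background checker chain I mean, in Python. It assumes the openai Python SDK (v1+); the model name, the checker wording, and the “NO MISTAKES” convention are all placeholder choices, not tested values.

```python
# Minimal sketch of a "complete, then check, then fix" background chain.
# Assumes the openai Python SDK (v1+); model name and prompt wording are placeholders.
from openai import OpenAI

client = OpenAI()

def ask(messages):
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content

def answer_with_checker(user_prompt):
    # Step 1: the ordinary one-shot answer.
    draft = ask([{"role": "user", "content": user_prompt}])

    # Step 2: a second ChatGPT call whose only job is to look for mistakes.
    critique = ask([{
        "role": "user",
        "content": (f"Task:\n{user_prompt}\n\nAnswer:\n{draft}\n\n"
                    "List any factual or logical mistakes in this answer. "
                    "If there are none, reply exactly: NO MISTAKES."),
    }])
    if "NO MISTAKES" in critique:
        return draft

    # Step 3: reply to the original conversation describing the error and asking for a fix
    # (the other option is simply regenerating the original answer).
    return ask([
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": draft},
        {"role": "user", "content": f"A reviewer found these problems:\n{critique}\nPlease fix them and give a corrected answer."},
    ])
```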
Like right now I typically run 2 additional background chains by default, for every single thing I ask ChatGPT. Not just for tasks where I’m seeking rigour and want to avoid factual mistakes like “the queen was captured”, but just to get higher-quality responses in general.
Original prompt → Improve this answer. → Improve this answer.
Not literally just those words, but even something that simple is genuinely better than asking only once. Seriously. Try it, confirm it, and make it a habit. Sometimes it’s shocking. I ask for a simple JavaScript function, and it pumps out a 20-line function that looks fine to me. I habitually ask for a better version and get back, “Upon reflection, you can do this in two lines of JavaScript that run 100x faster.”
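As a sketch of what that “Improve this answer” chain looks like in code, reusing the hypothetical ask() helper from the sketch above (the number of rounds and the follow-up wording are arbitrary):

```python
# Sketch of "Original prompt → Improve this answer. → Improve this answer."
# Reuses the ask() helper from the previous sketch; rounds=2 matches the two
# background chains described above, but the number is an arbitrary choice.
def improve_chain(user_prompt, rounds=2):
    messages = [{"role": "user", "content": user_prompt}]
    answer = ask(messages)
    for _ in range(rounds):
        messages += [
            {"role": "assistant", "content": answer},
            {"role": "user", "content": "Improve this answer."},
        ]
        answer = ask(messages)
    return answer
```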
If GPT were 100x cheaper I would be tempted to just go wild with this: every prompt becomes 200 or 300 prompts in the background, invisibly, instead of 2 or 3. I’m sure there are diminishing returns, and the chain would be more complicated than repeating “Improve” 100 times, but if it were fast and cheap enough, why not do it?
As an aside, I think of asking ChatGPT to write code like asking a human to code a project on a whiteboard, without the internet to find answers, a computer to run code on, or even paper references. The human can probably do it, sort of, but I bet the code will have tons of bugs, errors, and even API ‘hallucinations’ if you run it! I think it’s even worse than that: it’s almost as if ChatGPT isn’t even allowed to erase anything it wrote on the whiteboard. But we don’t need to one-shot everything, so do we care about infinite length completions? Humans do things in steps, and when ChatGPT isn’t trying to whiteboard everything, when it can check API references, when it can see what the code returns and what errors it throws, when it can recurse on itself to improve things, it’s way better.

Right now the form this takes is a human on the ChatGPT web page asking for code, running it, and then pasting the error message back into ChatGPT. More automated versions of this are trickling out. Then I imagine the future, asking ChatGPT for code when it’s 1000x cheaper, and my one question behind the scenes is actually 1000 prompts: looking up APIs on the internet, running the code in a simulator (or for real, people are already doing that), looking at the errors or results, etc. And that’s the boring, unimaginative extrapolation.
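Here is a toy sketch of that “run it, paste the error back” loop, again reusing the hypothetical ask() helper; executing model-generated code is obviously unsafe outside a sandbox, and the prompt wording and attempt count are just placeholders:

```python
# Toy sketch of the manual workflow: ask for code, run it, feed the error back.
# Unsafe outside a sandbox; shown only to illustrate the shape of the loop.
import subprocess
import sys
import tempfile

def code_with_feedback(task, max_attempts=3):
    messages = [{"role": "user",
                 "content": f"Write a Python script that {task}. Reply with only the code."}]
    for _ in range(max_attempts):
        code = ask(messages)
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True, timeout=30)
        if result.returncode == 0:
            return code, result.stdout
        # Same move as pasting the error message back into the web page.
        messages += [
            {"role": "assistant", "content": code},
            {"role": "user",
             "content": f"Running this produced an error:\n{result.stderr}\nPlease fix the code."},
        ]
    return code, result.stderr
```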
Also, this is probably obvious, but just in case: if you try asking “Improve this answer.” repeatedly in ChatGPT, you need to manage your context window size. Migrate to a new conversation when you get to about 75% full. OpenAI should really warn you, because the quality drops like a rock even before you hit 100%. Just copy over your original request and the last best answer(s). If you’re doing it manually, select a few other useful bits too.
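In code, that bookkeeping might look something like this rough sketch; the ~4-characters-per-token estimate and the 8192-token limit are crude assumptions, not measured values:

```python
# Rough sketch of the context-window hygiene described above: when the conversation
# is about 75% full, start fresh with just the original request and the last best answer.
CONTEXT_LIMIT_TOKENS = 8192  # placeholder; depends on the model you're using

def approx_tokens(messages):
    # Crude heuristic: roughly 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def maybe_restart(messages):
    if approx_tokens(messages) < 0.75 * CONTEXT_LIMIT_TOKENS:
        return messages
    original_request = messages[0]
    last_best_answer = next(m for m in reversed(messages) if m["role"] == "assistant")
    return [original_request, last_best_answer]
```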
I think you’ve had more luck than me in trying to get ChatGPT to correct its own mistakes. When I tried making it play chess, I told it to “be sure not to output your move before writing a paragraph of analysis on the current board position, and output 5 good moves and the reasoning behind them, all of this before giving me your final move.” Then, after it chose its move, I asked “are you sure this is a legal move? and is this really the best move?” It pretty much never changed its answer, and never managed to figure out that its illegal moves were illegal. If I straight-up told it “this move is illegal”, it would excuse itself and output something else, and sometimes it correctly understood why its move was illegal, but not always.
“so do we care about infinite length completions?”
The inability of the GPT series to generate infinite length completions is crucial for safety! If humans fundamentally need to be in the loop for GPT to give us good outputs for things like scientific reasoning, then the whole thing is suddenly way safer, and we can be assured that there isn’t an instance of GPT running on some Amazon server, improving itself by doing a thousand years of scientific progress in a week.
Does the inability of the GPT series to generate infinite length completions require that humans specifically remain in the loop, or just that the external world must remain in the loop in some way that gets the model back into the distribution? Because if it’s the latter case, I think you still have to worry about some instance running on a cloud server somewhere.