But you don’t condition your future output on your typo being correct; that is what GPT is doing here. If it randomly makes a mistake that the text in its dataset wouldn’t contain, like mistakenly saying that a queen was captured, or takes a wrong step during a physics computation, then when it tries to predict the next word it still “thinks” that its past output was sampled from the distribution of human chess analysis or human physics problem-solving. On the human distribution, if “the queen was captured” appears in the past prompt, then you can take it as fact that the queen was captured, but this is false for text sampled from LLMs.
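A minimal sketch of this distribution mismatch (a toy stand-in, not GPT itself; the function and the probabilities are made up for illustration): at inference the model conditions on its own sampled prefix as if it were human-written text, so a hallucinated fact gets treated as ground truth by every later prediction.

```python
import random

random.seed(0)

def toy_llm_next_sentence(prefix):
    """Hypothetical stand-in for an LLM's next-sentence prediction.

    It was 'trained' only on human analysis, so it occasionally slips,
    but it always trusts whatever is already in the prefix."""
    if "the queen was captured" in " ".join(prefix):
        # Conditioning on its own earlier mistake as if it were fact:
        return "White is winning easily, being a queen up."
    if random.random() < 0.3:
        return "the queen was captured"  # a slip no human analyst would make here
    return "the position remains balanced"

prefix = ["1. e4 e5 2. Nf3 Nc6 3. Bb5 a6"]
for _ in range(5):
    prefix.append(toy_llm_next_sentence(prefix))

print("\n".join(prefix))
```

Once the slip enters the context, every subsequent prediction builds on it, because nothing in the prefix marks it as a hallucination rather than a fact.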
To solve this problem you would need a very large dataset of mistakes made by LLMs together with their corrected continuations. You’d have to take all the physics books ever written, intersperse them with LLM continuations, then have humans write corrections to those continuations, like “oh, actually we made a mistake in the last paragraph, here is the correct way to relate pressure to temperature in this problem...”.
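A rough sketch of what assembling that dataset might look like; `llm_continue` and `human_correction` are hypothetical placeholders standing in for an actual model call and an actual annotator.

```python
import json

def llm_continue(passage):
    # Hypothetical placeholder for sampling a continuation from an LLM;
    # in practice this would be a model API call.
    return "Since pressure is independent of temperature, P stays fixed..."

def human_correction(passage, continuation):
    # Hypothetical placeholder for an annotator writing the fix-up text
    # that acknowledges and repairs the model's mistake.
    return ("Oh, actually we made a mistake in the last paragraph: at constant "
            "volume, pressure and temperature are related by P/T = const.")

corpus = [
    "For an ideal gas held at constant volume, we heat the container "
    "and ask how the pressure changes."
]

dataset = []
for passage in corpus:
    continuation = llm_continue(passage)
    correction = human_correction(passage, continuation)
    # Each example teaches the model that text like *this* continuation
    # may be wrong, and *this* is what a correction looks like afterwards.
    dataset.append({"context": passage,
                    "llm_continuation": continuation,
                    "correction": correction})

print(json.dumps(dataset, indent=2))
```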
It doesn’t have to be humans any more; GPT-4 can do this to itself.
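The claim is roughly a loop like the following (a sketch only, assuming the pre-1.0 OpenAI Python SDK; the model name and prompts are illustrative, not an established recipe):

```python
import openai  # pre-1.0 SDK interface

def chat(messages):
    # Single call to the chat model; model name is illustrative.
    resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return resp["choices"][0]["message"]["content"]

problem = "Relate the pressure and temperature of an ideal gas at constant volume."

messages = [{"role": "user", "content": problem}]
answer = chat(messages)

# Ask the model to audit its own previous output instead of a human doing it.
messages += [
    {"role": "assistant", "content": answer},
    {"role": "user", "content": "Is there a mistake in your previous answer? "
                                "If so, write the corrected continuation."},
]
correction = chat(messages)
print(correction)
```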
It doesn’t work (at least right now). When I tried making ChatGPT play chess against Stockfish, giving it positions in algebraic notation and telling it to output two paragraphs of chess analysis before making its move, it would make a nonsensical move, and if I prompted it with “is there a mistake in the past output?” or “are you sure this move is legal?”, it doesn’t realize that anything is out of order. Only once I point out the error explicitly does it realize that it made one and rationalize an explanation for the error.
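For reference, roughly how such a test could be scripted, assuming python-chess, a local Stockfish binary, and the pre-1.0 OpenAI SDK; the prompt wording and move extraction are simplified guesses, not the exact setup used.

```python
import chess          # python-chess
import chess.engine
import openai         # pre-1.0 SDK; model name is illustrative

def ask_gpt(prompt):
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}])
    return resp["choices"][0]["message"]["content"]

board = chess.Board()
engine = chess.engine.SimpleEngine.popen_uci("stockfish")  # path is illustrative

while not board.is_game_over():
    moves_san = chess.Board().variation_san(board.move_stack) if board.move_stack else "(start)"
    prompt = (f"Position after: {moves_san}\n"
              "Write two paragraphs of analysis, then give your move in SAN "
              "on the last line.")
    reply = ask_gpt(prompt)
    move_text = reply.strip().splitlines()[-1]  # naive extraction of the move
    try:
        board.push_san(move_text)
    except ValueError:
        # The failure mode described above: the model rarely flags this itself.
        followup = ask_gpt(reply + "\n\nAre you sure this move is legal?")
        print("Illegal move:", move_text, "| follow-up:", followup)
        break
    # Stockfish answers.
    result = engine.play(board, chess.engine.Limit(time=0.1))
    board.push(result.move)

engine.quit()
```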
That is a novel (and, in my opinion, potentially important/scary) capability of GPT-4. You can look at A_Posthuman’s comment below for details. I do expect it to work on chess; I’d be interested if proven wrong. You mentioned ChatGPT, but it can’t do reflection at a usable level. To be fair, I don’t know whether GPT-4’s capabilities are at a useful level or only a tweak away right now, or how far they can be pushed if they are (as in, whether it can self-improve to ASI), but for solving the “curse” problem even weak reflection capabilities should suffice.