I don’t think this sort of prompt actually gets at the conscious reasoning gap. It only takes a single attention head to copy the exact next-token prediction the model already made at an earlier position, and I’d expect that with few-shot prompting (especially if you fill the entire context with few-shot examples), the model would use its induction-like heads to simply copy those earlier predictions and perform quite well.
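To make the worry concrete, here is a minimal sketch of the kind of few-shot setup I have in mind; the prompt format and function name are illustrative assumptions, not from any particular experiment:

```python
# Hypothetical few-shot prompt construction: each shot pairs a text
# prefix with the model's own earlier next-token prediction. An
# induction-like head only needs to match the repeated
# "Text: ... / Next token: ..." pattern and copy the completion, so
# strong performance here wouldn't demonstrate verbalized reasoning.
def build_few_shot_prompt(solved_examples, target_prefix):
    # solved_examples: list of (prefix, predicted_token) pairs taken
    # from the model's own prior greedy predictions (assumed given).
    shots = "\n\n".join(
        f"Text: {prefix}\nNext token: {token}"
        for prefix, token in solved_examples
    )
    return f"{shots}\n\nText: {target_prefix}\nNext token:"
```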
A better test would be to have the model describe its reasoning about predicting the next token, and then pass that description back to the model in a fresh, isolated prompt and ask it to predict the next token from the reasoning alone.
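A minimal sketch of that two-stage protocol, assuming a generic completion wrapper (`query_model` is a hypothetical stub, and the prompt wording is mine):

```python
def query_model(prompt: str) -> str:
    """Hypothetical stand-in for whatever completion API is tested."""
    raise NotImplementedError("plug in your model API here")

def two_stage_prediction(text_prefix: str) -> str:
    # Stage 1: ask the model to verbalize its reasoning about the
    # upcoming token.
    reasoning = query_model(
        "Describe, step by step, your reasoning about what the next "
        f"token of the following text will be:\n\n{text_prefix}"
    )
    # Stage 2: a fresh, isolated prompt containing only the verbalized
    # reasoning -- no shared context, so induction-like heads have no
    # earlier prediction to copy from.
    return query_model(
        "Based only on the following reasoning, predict the next "
        f"token:\n\n{reasoning}"
    )
```

Because the second call shares no context with the first, any success has to come through the verbalized reasoning rather than from copying a cached prediction.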