Could we agree on a testable prediction of this theory? Take the chess degradation example. I think your argument predicts that if we play several games of chess against ChatGPT in a row, its performance will keep going down in later games, in terms of both quality and legality, potentially to the point that the last attempt is complete gibberish. Would that be a good test?
Certainly I would agree with that. In fact, right now I can’t even get ChatGPT to play a single game of chess (against Stockfish) from start to finish without it outputting an illegal move at some point. I expect that future versions of GPT will be coherent for longer, but I don’t expect GPT to suddenly “get it” and be able to play legal and coherent chess for sequences of arbitrary length. (Google tells me that a typical chess game lasts about 40 moves, so maybe Go would be a better choice, with a typical game running around 150 moves.) And I certainly don’t expect GPT to be able to play chess AND also write coherent chess commentary between each move, since that would greatly increase the timescale of required coherence.
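One rough way to score the proposed test, as a sketch rather than an agreed protocol: for each game in the run, record how many moves were played before the first illegal move, then fit a straight line to those counts against the game index. The function name `legal_streak_slope` and the choice of a least-squares slope as the metric are my own assumptions for illustration; nothing here has been run against ChatGPT or Stockfish.

```python
from statistics import mean


def legal_streak_slope(moves_before_illegal):
    """Least-squares slope of legal-streak length against game index.

    moves_before_illegal[i] is the number of moves played in game i
    (games numbered 0, 1, 2, ... in the order they were played) before
    the model first output an illegal move. This input format is an
    assumption made for the sketch, not something fixed by the thread.
    """
    n = len(moves_before_illegal)
    if n < 2:
        raise ValueError("need at least two games to estimate a trend")

    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(moves_before_illegal)

    # Ordinary least-squares slope: cov(x, y) / var(x).
    numerator = sum(
        (x - x_bar) * (y - y_bar) for x, y in zip(xs, moves_before_illegal)
    )
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator
```

Under this scoring, a clearly negative slope over a long enough run of games would support the degradation prediction, while a slope near zero would count against it. It only captures legality, not move quality, which would need a separate measure.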