There’s a lot of speculation about how these models operate. You specifically say “you don’t know” how it works, but suggest that it has some sort of planning phase.
As Wolfram explains, the Transformer architecture predicts one word at a time based on the previous inputs run through the model.
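To make that concrete, here is a minimal sketch of autoregressive generation. The “model” is just a toy bigram lookup I made up for illustration; a real Transformer would score every token in its vocabulary at each step, but the loop structure — produce one token, append it, condition the next prediction on everything so far — is the same.

```python
# Toy stand-in for a language model: maps the last token to the next one.
# Purely illustrative; a real Transformer outputs a probability
# distribution over the whole vocabulary at each step.
toy_bigram = {
    "once": "upon",
    "upon": "a",
    "a": "time",
}

def generate(prompt: str, steps: int) -> str:
    tokens = prompt.split()
    for _ in range(steps):
        # Each prediction conditions only on the tokens produced so far --
        # there is no separate planning pass over the future output.
        next_token = toy_bigram.get(tokens[-1])
        if next_token is None:
            break
        tokens.append(next_token)
    return " ".join(tokens)

print(generate("once", 3))  # -> "once upon a time"
```

The point is that any apparent “plan” emerges one token at a time from the conditional distribution, not from an explicit lookahead stage.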
Any planning you think you see is merely a trend based on common techniques for answering questions. The five-part storytelling structure is an established technique, commonly used in writing, and is thus embedded in the model’s training data and reflected in its responses.
In the future, these models could very well have planning phases—and more than next-word prediction aligned with common writing patterns.
If you look at the other comments I’ve made today you’ll see that I’ve revised my view somewhat.
As for real planning, that’s certainly what Yann LeCun talked about in the white paper he uploaded last summer.