Thanks for reminding me that training uses inference.
As for ChatGPT having a global plan, as you can see if you look at the comments I’ve made earlier today, I have come around to that view. The people who wrote the stories ChatGPT consumed during training had plans, and those plans are reflected in the stories they wrote. That structure is “smeared” over all those parameter weights and gets “reconstructed” each time ChatGPT generates a new token.
In his last book, The Computer and the Brain, John von Neumann noted, quite correctly, that each neuron is both a memory store and a processor. Subsequent research has made it clear that the brain stores specific things – objects, events, plans, whatever – in populations of neurons, not individual neurons. These populations operate in parallel.
We don’t yet have the luxury of such processors, so we have to make do with programming a virtual neural net to run on a processor having far more memory units than processing units. And so our virtual machine has to visit each memory unit every time it takes one step in its virtual computation.
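A minimal sketch of what that “visiting every memory unit” looks like in practice: one layer’s step is essentially a matrix–vector product, and on a conventional processor every stored weight has to be read for each step. The layer sizes and names below are illustrative only, not drawn from any particular model.

```python
import numpy as np

# Illustrative sizes only; real models have billions of weights.
n_inputs, n_neurons = 64, 128
weights = np.random.randn(n_neurons, n_inputs)   # the "memory units"
bias = np.random.randn(n_neurons)

def layer_step(x):
    """One step of the virtual neural net: every stored weight is read once.

    A brain-like device would do this with n_neurons processors working in
    parallel; here a single processor walks through all the memory instead.
    """
    activation = np.zeros(n_neurons)
    for i in range(n_neurons):          # one "neuron" at a time
        for j in range(n_inputs):       # every memory unit is visited
            activation[i] += weights[i, j] * x[j]
        activation[i] = np.tanh(activation[i] + bias[i])
    return activation

out = layer_step(np.random.randn(n_inputs))
```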
It does seem like there are “plans” or formats in place, not just a choice of the next best word.
When it creates a resume, a business plan, or a timeline, it seems much more likely that there is some form of structure or template it is using, and it then chooses the words that go best in their correct places.
Stories have a structure: beginning, middle, end. So it’s not just picking words; it’s picking the words that go best with a beginning, then the words that go best with a middle, and then an end. If it were just choosing next words, you could imagine it being a little more creative and less formulaic.
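One way to see how “beginning, middle, end” can emerge from next-token choice: each new token is sampled conditioned on everything written so far, so the weights can favor ending-type words once the context already looks like an ending. A minimal sketch, assuming a toy vocabulary and a stub next_token_logits function invented for illustration (a real model would compute those logits with a forward pass over its weights):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["Once", "upon", "a", "time", "the", "end", "."]  # toy vocabulary

def next_token_logits(context):
    """Stand-in for the trained network (hypothetical). The point is the
    signature: it sees the *entire* context so far, not just the last word."""
    return rng.normal(size=len(vocab))

def generate(prompt_tokens, max_new=10):
    tokens = list(prompt_tokens)
    for _ in range(max_new):
        logits = next_token_logits(tokens)   # conditioned on all prior tokens
        probs = np.exp(logits) / np.exp(logits).sum()
        tokens.append(vocab[rng.choice(len(vocab), p=probs)])
    return tokens

print(" ".join(generate(["Once", "upon"])))
```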
This model was trained by humans, who told it when it had the structure right, and the weights got set heavier where it conformed to the right preexisting plan. So if anything, the “neural” pathways that formed the strongest connections are the ones that (1) resulted in the best use of tokens and (2) were weighted deliberately higher by the human trainers.
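A loose sketch of the “weighted deliberately higher by the human trainers” idea: if each training example carries a human approval score, that score can scale how hard the example pulls on the weights. Everything below is a toy stand-in; ChatGPT’s actual fine-tuning uses a learned reward model and reinforcement learning, which this does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(1)
features = rng.normal(size=(100, 8))       # stand-in for model activations
targets = rng.normal(size=100)             # stand-in for "correct" outputs
human_weight = rng.choice([1.0, 0.1], size=100)  # trainer approval score

w = np.zeros(8)
lr = 0.01
for _ in range(200):
    preds = features @ w
    errors = preds - targets
    # Examples the trainers approved of pull harder on the weights.
    grad = (features * (human_weight * errors)[:, None]).mean(axis=0)
    w -= lr * grad
```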