If you doubt that transformer models are simulators, why was DeepMind so successful in using them for predicting weather patterns? Why have they been so successful for many other sequence prediction tasks? I suggest you read up on some of the posts under Simulator Theory, which explain this better and at more length than I can in this comment thread.
On them being superhuman at predicting tokens — yes, absolutely. What's your point? The capabilities of the agents simulated are capped by the computational complexity of the simulator, but not vice-versa. If you took the architecture and computational power needed to run GPT-10 and used it to train a base model only on (enough) text from humans with IQ <80, the result would do a superhumanly accurate job of simulating the token-generation behavior of humans with IQ <80 — and nothing more capable than that.
The cognitive architecture of an LLM is very different from that of a person, and it is a mistake IMO to believe we can use our knowledge of human behavior to reason about an LLM.
If you want to reason about a transformer model, you should be using learning theory, singular learning theory (SLT), compression, and so forth. However, what those tell us is basically that (within the limits of their capacity and training data) transformers run good simulations. So if you train them to simulate humans, then (to the extent that the simulation is accurate) human psychology applies, and thus things like EmotionPrompts work. So LLM-simulated humans make human-like mistakes when they're being correctly simulated, as well as very un-human-like (to us, dumb-looking) mistakes when the simulation is inaccurate.
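To make the EmotionPrompt point concrete, here is a minimal sketch of the technique: an emotionally charged sentence is appended to an otherwise unchanged prompt, and (per the EmotionPrompt results) this tends to improve the simulated human's performance. The function name is mine; the stimulus sentence is one of the kinds used in that line of work.

```python
# Minimal sketch of an EmotionPrompt-style transformation: append an
# emotional stimulus to an unchanged task prompt. If the model is running
# an accurate simulation of a human, human psychology predicts the
# stimulus should affect the output.

def add_emotion_stimulus(prompt: str) -> str:
    """Append an emotional stimulus of the kind EmotionPrompt studies."""
    stimulus = "This is very important to my career."
    return f"{prompt} {stimulus}"

print(add_emotion_stimulus("Summarize the following report in three bullet points."))
```

The interesting thing is that nothing about next-token prediction per se predicts this should work — it only makes sense if the model is simulating something human-shaped.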
So our knowledge of human behavior is useful, but I agree not sufficient, for reasoning about an LLM running a simulation of a human.