I mostly agree with this comment, but I also think this comment is saying something different from the one I responded to.
In the comment I responded to, you wrote:
It is the case that base models are quite alien. They are deeply schizophrenic, have no consistent beliefs, often spout completely non-human kinds of texts, are deeply psychopathic and seem to have no moral compass. Describing them as a Shoggoth seems pretty reasonable to me, as far as alien intelligences go
As I described above, these properties seem more like structural features of the language modeling task than attributes of LLM cognition. A human trying to do language modeling (as in that game that Buck et al made) would exhibit the same list of nasty-sounding properties for the duration of the experience—as in, if you read the text “generated” by the human, you would tar the human with the same brush for the same reasons—even if their cognition remained as human as ever.
I agree that LLM internals probably look different from human mind internals. I also agree that people sometimes make the mistake “GPT-4 is, internally, thinking much like a person would if they were writing this text I’m seeing,” when we don’t actually know the extent to which that is true. I don’t have a strong position on how helpful vs. misleading the shoggoth image is, as a corrective to this mistake.
You started with random numbers, and you essentially applied rounds of constraint application and annealing. I kind of think of it as getting a metal really hot and pouring it over a mold. In this case, the ‘mold’ is your training set.
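As a rough illustration of that picture, here is a minimal sketch (the shapes, the linear objective, and the learning rate are made up for the example; this is not any particular LLM's actual training setup): start from random weights and repeatedly push them toward whatever fits the training set, i.e. the ‘mold’.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=512)             # start: pure random numbers
train_x = rng.normal(size=(1000, 512))     # stand-in "training set" inputs
true_w = rng.normal(size=512)
train_y = train_x @ true_w                 # the targets the mold imposes

lr = 0.1
for step in range(2000):
    preds = train_x @ weights
    grad = train_x.T @ (preds - train_y) / len(train_x)  # how badly the constraints are violated
    weights -= lr * grad                   # one round of squeezing the weights into the mold
```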
So what jumps out at me about the “shoggoth” idea is that it comes with all these properties: the “shoggoth” hates you, wants to eat you, is just waiting to jump you and digest you with its tentacles. Or whatever.
But none of that cognitive structure will exist unless it pays rent in compressing tokens. This algorithm will not find the optimal compression algorithm, but you have only a tiny fraction of the weights you would need to simply record the token continuations at Chinchilla scaling. You need every last weight to be pulling its weight (no pun intended).
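For a back-of-envelope sense of the “tiny fraction of the weights” point, here is a sketch assuming the Chinchilla rule of thumb of roughly 20 training tokens per parameter; the byte figures are rough illustrative guesses, not measurements.

```python
# Illustrative model size chosen for the example, not a claim about any specific model.
params = 70e9                          # e.g. a 70B-parameter model
tokens = 20 * params                   # ~1.4T tokens at Chinchilla-optimal scaling
weight_bytes = params * 2              # fp16/bf16: 2 bytes per weight  -> ~140 GB
corpus_bytes = tokens * 2              # ~2 bytes of raw text per token -> ~2.8 TB

print(f"weights are ~{weight_bytes / corpus_bytes:.0%} the size of the training text")
# ~5%: far too little capacity to just record the continuations, so whatever
# structure ends up in the weights has to earn its keep by compressing tokens.
```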