I like this comment and agree overall. But, I do think I have one relevant disagreement:

> Also I don’t think that LLMs have “hidden internal intelligence”, given that e.g. LLMs trained on “A is B” fail to learn “B is A”.
I’m not quite sure what you mean by “hidden internal intelligence”, but if you mean “quite alien abilities and cognitive processes”, then I disagree and think it’s quite likely that SOTA LLMs have this. If you instead mean something like “an inner homunculus reasoning about what to simulate”, then I totally agree that LLMs very likely don’t have this. (Though I don’t see how the reversal curse provides much evidence either way on either of these claims.)
I think it’s pretty likely that there are many cases where LLMs are notably superhuman in some way. For instance, I think that LLMs are wildly superhuman at next token prediction, and generally I think base models have a somewhat alien intelligence profile (which is perhaps dampened to some extent in current RLHF’d chatbots).
These superhuman abilities are probably non-trivial to directly use, but might be possible to elicit with some effort (though it’s unclear if these abilities are very important or very useful for anything we care about).
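For the “superhuman at next token prediction” claim, the quantity being compared is per-token cross-entropy on held-out text. A minimal sketch of how to measure it, assuming a Hugging Face causal LM (the model name and the sample text are illustrative):

```python
# Minimal sketch: per-token next-token prediction loss for a causal LM.
# The model name and text are illustrative; any causal LM works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes transformers compute the mean shifted
    # cross-entropy over next-token predictions for us.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"mean next-token loss: {loss.item():.3f} nats "
      f"(perplexity {torch.exp(loss).item():.1f})")
```

This mean loss (equivalently, perplexity) is the score on which base models are claimed to beat humans.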
> If you instead mean something like “an inner homunculus reasoning about what to simulate”, then I totally agree that LLMs very likely don’t have this.
Yeah, I meant something like this. The reversal curse is evidence because if most output were controlled by “inner beings”, presumably they’d be smart enough to “remember” the reversal.
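To make that concrete, one way to probe the reversal curse is to compare the log-probability a model assigns to a fact stated in the direction it appears in training versus the reversed direction. A minimal sketch, assuming a Hugging Face causal LM; the model name is an illustrative stand-in, and the Tom Cruise / Mary Lee Pfeiffer pair is the example from the reversal curse paper:

```python
# Minimal sketch: probe the reversal curse by comparing the log-probability
# a causal LM assigns to "A is B" vs. the reversed "B is A".
# The model name and prompts are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of log-probs assigned to `completion` given `prompt`.

    Assumes the prompt tokenizes identically as a prefix of the full
    string (true for typical BPE tokenizers when the completion starts
    with a space).
    """
    prompt_len = tokenizer(prompt, return_tensors="pt")["input_ids"].shape[1]
    full_ids = tokenizer(prompt + completion, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(full_ids).logits
    # The logits at position t predict the token at position t + 1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    return sum(
        log_probs[i, targets[i]].item()
        for i in range(prompt_len - 1, full_ids.shape[1] - 1)
    )

# Same fact, both directions (example from the reversal curse paper).
forward = completion_logprob("Tom Cruise's mother is", " Mary Lee Pfeiffer")
reverse = completion_logprob("Mary Lee Pfeiffer's son is", " Tom Cruise")
print(f"forward: {forward:.2f}  reversed: {reverse:.2f}")
```

A model exhibiting the reversal curse assigns a much lower probability to the reversed statement, even though both sentences express the same fact.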
That’s a very strange conclusion. I certainly find it easier to recall “word A in a foreign language means X” than the reverse. If a homunculus simulated me (or the vast majority of humans), it would produce multiple instances of the reversal curse.
A distant philosophical analogy: my brain is smart enough to control my body, but I definitely can’t use its knowledge to build humanoid robots from scratch.
I’m not a simulator enthusiast, but I find your reasoning kinda sloppy.