I find this post fairly uninteresting, and feel irritated when people confidently make statements about “simulacra.” One problem is, on my understanding, that it doesn’t really reduce the problem of how LLMs work. “Why did GPT-4 say that thing?” “Because it was simulating someone who was saying that thing.” It does postulate some kind of internal gating network which chooses between the different “experts” (simulacra), so it isn’t contentless, but… Yeah.
Also, I don’t think that LLMs have “hidden internal intelligence”, given that e.g. LLMs trained on “A is B” fail to learn “B is A”. That’s big evidence against the simulators hypothesis. And I don’t think people have nearly enough evidence to be going around talking about “what the LLM is simulating”, unless this is some really loose metaphor, in which case it should be marked as such.
I also think it isn’t useful to think of LLMs as “simulating stuff” or having a “shoggoth” or whatever. I think that can often give a false sense of understanding.
However, I think this post did properly call out the huge miss of earlier speculation about oracles and agents and such.
I like this comment and agree overall. But, I do think I have one relevant disagreement:

Also, I don’t think that LLMs have “hidden internal intelligence”, given that e.g. LLMs trained on “A is B” fail to learn “B is A”
I’m not quite sure what you mean by “hidden internal intelligence”, but if you mean “quite alien abilities and cognitive processes”, then I disagree and think it’s quite likely that SOTA LLMs have this. If you instead mean something like “an inner homunculus reasoning about what to simulate”, then I totally agree that LLMs very likely don’t have this. (Though I don’t see how the reversal curse provides much evidence either way on either of these claims.)
I think it’s pretty likely that there are many cases where LLMs are notably superhuman in some way. For instance, I think that LLMs are wildly superhuman at next token prediction and generally I think base models have somewhat alien intelligence profiles (which is perhaps dropped to some extent in current RLHF’d chatbots).
These superhuman abilities are probably non-trivial to directly use, but might be possible to elicit with some effort (though it’s unclear if these abilities are very important or very useful for anything we care about).
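To make “wildly superhuman at next token prediction” concrete, here’s a minimal sketch of the quantity being talked about, assuming the HuggingFace transformers library and GPT-2 as a stand-in model (the claim is about much larger models; this only shows what gets measured):

```python
# Minimal sketch: compute a model's average next-token prediction loss on a text.
# GPT-2 and the sample sentence are stand-ins chosen for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The capital of France is Paris."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean per-token
    # cross-entropy of predicting each token from the preceding context.
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"average next-token loss (nats/token): {outputs.loss.item():.3f}")
print(f"perplexity: {torch.exp(outputs.loss).item():.1f}")
```

The claim above is that humans score far worse on exactly this metric than LLMs do; that loss/perplexity comparison is what “superhuman at next token prediction” refers to.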
If you instead mean something like “an inner homunculus reasoning about what to simulate”, then I totally agree that LLMs very likely don’t have this
Yeah, I meant something like this. The reversal curse is evidence because if most output were controlled by “inner beings”, presumably they’d be smart enough to “remember” the reversal.
That’s a very strange conclusion. I certainly find it easier to recall “word A in a foreign language means X” than the reversal. If a homunculus simulated me (or the vast majority of humans), it would create multiple instances of the reversal curse.
Distant philosophical example: my brain is smart enough to control my body, but I definitely can’t use its knowledge to create humanoid robots from scratch.
I’m not a simulator enthusiast, but I find your reasoning kinda sloppy.
I agree that this is a somewhat dated post. Janus has said as much, and I’ve encouraged them to edit the intro to say “y’all shouldn’t have been impressed by this” or something. With that said, some very weak defenses of a couple of specific things:
having a “shoggoth”
The way to ground that reasonably is that the “shoggoth” is the hypersurfaces of the volumes enclosed by decision boundaries. It’s mainly useful as a metaphor if it works as a way to translate into English the very basic idea that neural networks are function approximators. A lot of metaphorical terms are, in my view, attempts (which generally don’t succeed, especially, it seems, for you) to convey that neural networks are, fundamentally, just adjustable high-dimensional kaleidoscopes.
it isn’t contentless
It’s not trying to be highly contentful; it’s trying to clarify a bunch of people’s wrong intuitions about the very basics of what is even happening. If you already grok how descending the gradient of the cross entropy between two sequences requires a language model to approximate a function which compresses towards the data’s entropy floor, then the idea that the model “learns to simulate” is far too vague and nonspecific. But if you didn’t already grok why that math is what we use to define how well the model is performing at its task, then it might not be obvious what that task is, and calling it a “simulator” helps clarify the task.
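To spell out the math being gestured at (a standard decomposition, nothing specific to the post; here p is the data distribution, q_θ the model, H the entropy, and D_KL the KL divergence, all my notation):

```latex
% Next-token training objective: expected cross entropy between the data
% distribution p and the model q_theta.
\[
  \mathcal{L}(\theta)
  \;=\; -\,\mathbb{E}_{x \sim p}\!\Big[\textstyle\sum_t \log q_\theta(x_t \mid x_{<t})\Big]
  \;=\; H(p) \;+\; D_{\mathrm{KL}}\!\big(p \,\|\, q_\theta\big)
\]
% H(p) is the data's entropy floor: the loss can never go below it, and the
% only way to approach it is to drive the KL term toward zero, i.e. to match
% the distribution of the process that generated the text.
```

Driving the KL term toward zero is the precise sense in which training pushes the model to “simulate” whatever process produced the data.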
that can often give a false sense of understanding.
Yeah, agreed.
I don’t think Simulators claims or implies that LLMs have “hidden internal intelligence” or “an inner homunculus reasoning about what to simulate”, though. Where are you getting that from? This conclusion makes me think you’re referring to this post by Eliezer and not Simulators.
In what way is the reversal curse evidence against the simulation hypothesis?