Fair enough; comparing to quantum physics was overly snarky.
However, unless you have debug access to the language model and can figure out what specific neurons do, I don’t see how the notion of superposition is helpful? When figuring things out from the outside, we have access to words, not weights.
The value of thinking in terms of superposition is that each additional word sharply cuts down the distribution of possible continuations. Before a word is added, the distribution of possible continuations is wide, and that distribution is effectively a superposition of possibilities. Current models only let you sample from that distribution, but the neuron activations can be expected, at each iteration, to have structure that more or less matches the uncertainty over how the sentence might continue.
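To make "each word cuts the distribution down" concrete, here is a toy sketch (not the model's actual mechanics, and the probabilities are made up) that measures the width of a continuation distribution with Shannon entropy, before and after a word is added:

```python
import math

# Hypothetical next-word distributions conditioned on a growing prefix.
dist_after_the = {"cat": 0.3, "dog": 0.3, "king": 0.2, "robot": 0.2}
dist_after_the_cat = {"sat": 0.7, "slept": 0.2, "flew": 0.1}

def entropy(dist):
    """Shannon entropy in bits: a rough measure of how 'wide' the superposition is."""
    return -sum(p * math.log2(p) for p in dist.values())

print(entropy(dist_after_the))      # wider distribution, higher entropy (~1.97 bits)
print(entropy(dist_after_the_cat))  # adding a word narrows it (~1.16 bits)
```

The point is only that conditioning on more context typically lowers the entropy of what can come next, which is the classical-probability sense of "collapsing" a superposition.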
I actually think that the fact that classical multimodal probability distributions have worked this way all along is part of why people latch onto quantum wording. It's actually true, and humans know it, that there are quantum-sounding effects at macroscopic scale, because a lot of what's weird about quantum mechanics is really just the weirdness of probability! But real quantum effects are dramatically weirder than classical probability, for reasons I don't fully understand, like the added behavior of complex-valued amplitudes and the particular way complex-valued destructive interference works at quantum scales. All of which is to say: don't be too harsh on people who bring up quantum incorrectly; they're trying.
Note that stories are organized above the sentence level. I have just been examining stories that have two levels above sentences: segments of the whole story trajectory, and the whole trajectory. Longer stories could easily have more levels than that.
It appears to me that, once ChatGPT begins to tell a story, the distribution of possibilities for the whole story is fixed. The story then unfolds within that wider distribution. Each story segment has its own distribution within that wider distribution, and each sentence has an even narrower range of possibilities, but all within its particular story segment.
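One way to picture that nesting is as choices drawn from progressively narrower spaces: the whole-trajectory pick constrains which segments are available, and each segment constrains which sentences are available. A toy sketch (the genres, segments, and sentences here are invented for illustration):

```python
import random

# Hypothetical nested choice spaces: the story-level pick fixes which
# segment options exist, and each segment fixes which sentence options exist.
STORY_SPACE = {
    "fairy tale": {
        "quest segment": ["Aurora set out at dawn.", "The road wound through the forest."],
        "battle segment": ["The dragon descended.", "Fire filled the sky."],
    },
    "science fiction": {
        "launch segment": ["XP-708-DQ powered up.", "The ship cleared the atmosphere."],
        "battle segment": ["Alien ships swarmed the fleet.", "Shields flared and held."],
    },
}

def sample_story(rng):
    genre = rng.choice(list(STORY_SPACE))     # whole-trajectory distribution
    segments = STORY_SPACE[genre]
    segment = rng.choice(list(segments))      # narrower: within the chosen genre
    sentence = rng.choice(segments[segment])  # narrowest: within the chosen segment
    return genre, segment, sentence

genre, segment, sentence = sample_story(random.Random(0))
```

Once the genre is drawn, sentences from the other genre are simply off the table, which is the sense in which the whole-story distribution is "fixed" at the start.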
Now, let’s say that we have a story about Princess Aurora. I asked ChatGPT to tell me a new story based on the Aurora story. But, instead of Aurora being the protagonist, the protagonist is XP-708-DQ. What does ChatGPT do? (BTW, this is experiment 6 from my paper.)
It tells a new story, but shifts it from a fairytale ethos – knights, dragons – to a science fiction ethos where XP-708-DQ is a robot and the galaxy (which is "far, far away") is attacked by aliens in spaceships. Note that I did not explicitly say that XP-708-DQ was a robot. ChatGPT simply assumed that it was, which is what I expected it to do. Given e.g. R2-D2 and C-3PO, that's a reasonable assumption.
What we have, it would seem, is an abstract schema for a story, with a bunch of slots (variables) that can be filled in to define the nature of the world, slots for a protagonist and an antagonist, slots for actions taken, and so forth. A fairy tale fleshes out the schema in one way, a science fiction story fleshes it out in a different way. In my paper I perform a bunch of experiments in which I 'force' ChatGPT to change how the slots are filled. When Princess Aurora is swapped for Prince Henry (experiment 1), only a small number of slots have to be filled in a different way. When she's swapped for XP-708-DQ, a lot of slots are filled in a different way. That's also the case when Aurora becomes a giant chocolate milkshake (experiment 7). The antagonist is switched from a dragon to an erupting volcano whose heat melts all it encounters.
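The slot-filling idea can be sketched directly. Below, a minimal hypothetical schema (the slot names and fillers are my invention, chosen to echo the experiments) with a count of how many slots change under each swap:

```python
# Hypothetical story schema: the same slot structure, filled two ways.
SCHEMA_SLOTS = ["world", "protagonist", "antagonist", "resolution"]

fairy_tale = {
    "world": "a kingdom far away",
    "protagonist": "Princess Aurora",
    "antagonist": "a dragon",
    "resolution": "the dragon is defeated",
}

science_fiction = {
    "world": "a galaxy far, far away",
    "protagonist": "XP-708-DQ, a robot",
    "antagonist": "alien invaders",
    "resolution": "the invaders are repelled",
}

def slots_changed(a, b):
    """Count how many slots differ between two fillings of the schema."""
    return sum(a[s] != b[s] for s in SCHEMA_SLOTS)

# Swapping Aurora for Prince Henry touches one slot; swapping in
# XP-708-DQ pulls the whole filling into a different genre.
prince_henry = dict(fairy_tale, protagonist="Prince Henry")
print(slots_changed(fairy_tale, prince_henry))     # 1
print(slots_changed(fairy_tale, science_fiction))  # 4
```

The small-swap/large-swap contrast is the same one the experiments exhibit: some substitutions are local to a slot, others force most of the schema to be refilled.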