tailcalled comments on tailcalled’s Shortform

tailcalled 20 May 2024 18:58 UTC
2 points
Back to clipping away an entire range, rather than a single dimension. Here it is with the dimensions ordered by the importance computed by clipping away a single dimension:
Less chaotic maybe, but also much slower at reaching reasonable performance, so I tried a compromise ordering that takes both size and performance into account:
Doesn’t seem like it works super great tbh.
Edit: for completeness’ sake, here’s the initial graph with log-surprise-based plotting.
- tailcalled 21 May 2024 17:24 UTC
  2 points
  To quickly find the subspace that the model is using, I can binary-search over the number of singular vectors kept, looking for the smallest number at which the probability when clipping exceeds the probability when not clipping.
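For illustration, the binary search described above could look roughly like the sketch below. It assumes a hypothetical callable `clipped_logprob(k)` that returns the log-probability when only the top `k` singular vectors are kept, and it assumes that this score crosses the baseline only once as `k` grows, which the comment does not guarantee. All names are invented for the example and are not taken from the actual experiment code. The comparison is `>=` rather than strict, so that keeping every dimension always counts as sufficient.

```python
from typing import Callable


def smallest_sufficient_rank(
    clipped_logprob: Callable[[int], float],  # hypothetical scoring function
    baseline_logprob: float,                  # score without any clipping
    max_rank: int,                            # total number of singular vectors
) -> int:
    """Return the smallest k in [0, max_rank] whose clipped score reaches the baseline.

    Assumes the predicate `clipped_logprob(k) >= baseline_logprob` is False for
    small k and True for large k (a single crossover). If it is never True
    below max_rank, max_rank is returned without being evaluated.
    """
    lo, hi = 0, max_rank
    while lo < hi:
        mid = (lo + hi) // 2
        if clipped_logprob(mid) >= baseline_logprob:
            hi = mid        # mid is enough, try fewer vectors
        else:
            lo = mid + 1    # mid is not enough, need more vectors
    return lo


# Toy stand-in for the model: pretend 37 vectors are needed out of 100.
def toy_score(k: int) -> float:
    return 0.0 if k >= 37 else -1.0


found = smallest_sufficient_rank(toy_score, baseline_logprob=0.0, max_rank=100)
print(found)
```

Each probe costs one evaluation of the clipped model, so the search needs about log2(max_rank) evaluations instead of one per candidate rank.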
  A relevant follow-up is what happens to other samples from the same prompt when clipping. Take the 1886-dimensional subspace found from this generation (prompt in brackets):
  [I believe the meaning of life is] to be happy. It is a simple concept, but it is very difficult to achieve. The only way to achieve it is to follow your heart. It is the only way to live a happy life. It is the only way to be happy. It is the only way to be happy.
  The meaning of life is
  When I extrapolate “I believe the meaning of life is” using that subspace, I get:
  [I believe the meaning of life is] to find happy. We is the meaning of life. to find a happy.
  And to live a happy and. If to be a a happy.
  . to be happy.
  . to be happy.
  . to be a happy.. to be happy.
  . to be happy.
  Which seems sort of vaguely related, but idk.
  Another test is just generating without any prompt, in which case these vectors give me:
  Question is a single thing to find. to be in the best to be happy. I is the only way to be happy.
  I is the only way to be happy.
  I is the only way to be happy.
  It is the only way to be happy.. to be happy.. to be happy. to
  Using a different prompt:
  [Simply put, the theory of relativity states that ]1) the laws of physics are the same for all non-accelerating observers, and 2) the speed of light in a vacuum is the same for all observers, regardless of their relative motion or of the motion of the source of the light. Special relativity is a theory of the structure of spacetime
  I can get a 3329-dimensional subspace which generates:
  [Simply put, the theory of relativity states that ] 1) time is relative and 2) the speed of light in a vacuum is constant for all observers.
  1) Time is relative, meaning that if two observers are moving relative to each other, the speed of light is the same for all observers, regardless of their motion. For example, if you are moving relative
  or, with no prompt:
  Question: In a simple harmonic motion, the speed of an object is
  A) constant
  B) constant
  C) constant
  D) constant
  In the physics of simple harmonic motion, the speed of an object is constant. The speed of the object can be constant, but the speed of an object can be
  Another example:
  [A brief message congratulating the team on the launch:
  Hi everyone,
  I just ] wanted to congratulate you all on the launch. I hope
  that the launch went well. I know that it was a bit of a
  challenge, but I think that you all did a great job. I am
  proud to be a part of the team.
  Thank you for your
  can yield a 2696-dimensional subspace, which generates:
  [A brief message congratulating the team on the launch:
  Hi everyone,
  I just ] wanted to say you for the launch of the launch of the team.
  The launch was successful and I am so happy to be a part of the team and I am sure you are all doing a great job.
  I am very looking to be a part of the team.
  Thank you all for your hard work,
  or, with no prompt:
  def measure and is the definition of the new, but the
  the is a great, but the
  The is the
  The is a
  The is a
  The is a
  The
  The is a
  The
  The
  The is a
  The
  The is a
  And finally,
  [Translate English to French:
  sea otter ⇒ loutre de mer
  peppermint ⇒ menthe poivrée
  plush girafe ⇒ girafe peluche
  cheese =>] fromage
  pink ⇒ rose
  blue ⇒ bleu
  red ⇒ rouge
  yellow ⇒ jaune
  purple ⇒ violet
  brown ⇒ brun
  green ⇒ vert
  orange ⇒ orange
  black ⇒ noir
  white ⇒ blanc
  gold ⇒ or
  silver ⇒ argent
  can yield a 2518-dimensional subspace, which generates:
  [Translate English to French:
  sea otter ⇒ loutre de mer
  peppermint ⇒ menthe poivrée
  plush girafe ⇒ girafe peluche
  cheese =>] fromage
  cheese ⇒ fromage
  cheese ⇒ fromage
  f cheese ⇒ fromage
  butter ⇒ fromage
  apple ⇒ orange
  yellow ⇒ orange
  green ⇒ vert
  black ⇒ noir
  blue ⇒ ble
  purple ⇒ violet
  white ⇒ blanc
  or, with no prompt:
  Question: A 201
  The sum of a
  The following
  the sum
  the time
  the sum
  the
  the
  the
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  The
  - tailcalled 21 May 2024 17:35 UTC
    2 points
    Given the large number of dimensions that are kept in each case, there must be considerable overlap in which dimensions they make use of. But how much?
    I concatenated the dimensions found for each of the prompts and performed an SVD on the result, which yielded this plot:
    … unfortunately this seems close to the worst-case scenario. I had hoped for some split between general and task-specific dimensions, yet this seems like an extremely uniform mixture.
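To make the concatenate-then-SVD step concrete, here is a minimal NumPy sketch. It assumes each prompt's subspace is stored as a `(d, k_i)` array with orthonormal columns, which is a guess about the representation and not something the comment states. The function name and the toy bases are hypothetical.

```python
import numpy as np


def concatenated_spectrum(bases):
    """Singular values of the column-wise concatenation of several bases.

    Each basis is assumed to be a (d, k_i) array with orthonormal columns.
    The squared singular values are the eigenvalues of the sum of the
    projectors onto each subspace, so a direction shared by n subspaces
    contributes sqrt(n) and a direction used by only one contributes 1.
    """
    stacked = np.concatenate(bases, axis=1)
    return np.linalg.svd(stacked, compute_uv=False)


# Toy example in d = 6: both subspaces use axis 0, the other axes are not shared.
eye = np.eye(6)
basis_a = eye[:, [0, 1, 2]]
basis_b = eye[:, [0, 3]]

spectrum = concatenated_spectrum([basis_a, basis_b])
print(np.round(spectrum, 3))
```

Under this reading, a clean split between general and task-specific dimensions would show up as a step in the spectrum: a block of singular values near the square root of the number of prompts, then a drop to values near 1. A smooth spectrum without such a step is what a uniform mixture would look like.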
    - tailcalled 21 May 2024 18:08 UTC
      2 points
      If I look at the pairwise overlap between the dimensions needed for each generation:
      … then this is predictable down to ~1% error simply by assuming that each generation picks a random subset of the dimensions, so that the overlap between two of them is proportional to the product of their individual sizes.
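As a rough illustration of that random-subset baseline: for two independent, uniformly random subspaces of sizes k1 and k2 in a d-dimensional space, the expected overlap is k1 * k2 / d when overlap is measured as the squared Frobenius norm of the product of their orthonormal bases. The comment does not say how overlap was measured or what d is, so both the measure and the toy sizes below are assumptions, and the helper names are hypothetical.

```python
import numpy as np


def subspace_overlap(u, v):
    """Squared Frobenius norm of u.T @ v for orthonormal bases u (d, k1) and v (d, k2)."""
    return float(np.linalg.norm(u.T @ v) ** 2)


def expected_random_overlap(k1, k2, d):
    """Expected overlap of two independent uniformly random subspaces of R^d."""
    return k1 * k2 / d


def random_basis(d, k, rng):
    """Orthonormal basis of a random k-dimensional subspace, via QR of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q


# Toy sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
d, k1, k2 = 64, 20, 30

observed = float(np.mean([
    subspace_overlap(random_basis(d, k1, rng), random_basis(d, k2, rng))
    for _ in range(50)
]))
predicted = expected_random_overlap(k1, k2, d)
print(observed, predicted)
```

Comparing each measured pairwise overlap against `expected_random_overlap` for the corresponding sizes is one way to check how far the real subspaces deviate from the random-subset picture.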