I think this is a fascinating idea, although I have to be honest that I don’t find the examples you’ve provided very compelling. In order to be persuaded of the usefulness of these techniques, I’d want to see more concrete examples, as when the examples are abstract it is very hard (and subjective) to evaluate how well it is doing at decoding a latent representation in a new context.
In case anyone finds it helpful, the short version of this post seems to be:
Train a model to encode and decode text to and from a latent space
Train a model to predict the next segment of a latent space from the previous segment
Replace a segment of the latent space with a latent context and decode to steer as desired. There are a few ways to construct the steering latent, including directly encoding a text string or averaging the encodings of several text strings.
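The three steps above can be sketched in code. This is a toy illustration only: the bag-of-characters "encoder" stands in for the real text VAE, and all the names (encode, average_latents, steer) are made up for this sketch, not the post's actual API.

```python
import numpy as np

def encode(text: str, dim: int = 64) -> np.ndarray:
    """Toy stand-in for the VAE encoder: map text to a unit latent vector."""
    z = np.zeros(dim)
    for byte in text.encode("utf-8"):
        z[byte % dim] += 1.0
    return z / max(1.0, np.linalg.norm(z))

def average_latents(texts: list[str]) -> np.ndarray:
    """One way to build a steering latent: average several encoded strings."""
    return np.mean([encode(t) for t in texts], axis=0)

def steer(story_latents: list[np.ndarray],
          steering: np.ndarray, index: int) -> list[np.ndarray]:
    """Step 3: replace one segment's latent with the steering latent
    (a real pipeline would then decode the modified sequence)."""
    out = list(story_latents)
    out[index] = steering
    return out

# Encode a two-segment "story", then steer the second segment toward rain.
latents = [encode("A girl walked her dog."), encode("They saw flowers.")]
rain = average_latents(["It began to rain.",
                        "A torrential downpour soaked everything."])
steered = steer(latents, rain, index=1)
```

A real implementation would decode `steered` back to text with the VAE decoder; the interesting behavior in the post comes from that decoder, which this sketch omits.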
Why? Latents provide additional options for steering versus prompts. For example, it makes sense to average several latents together, whereas averaging several encoded prompts at the token level would produce gibberish, and concatenating them would produce an absurdly large prompt. Similarly, we can pick a latent that represents how we'd like the text to end and linearly phase it in over time. This is better than using a bidirectional language model, which would force the text to end with a particular string rather than ending on a particular note compatible with what was written before.
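The linear phase-in described above can be sketched as a simple interpolation schedule over latent vectors. This is an illustrative sketch, not the post's implementation; `phase_in` is a hypothetical name.

```python
import numpy as np

def phase_in(base: np.ndarray, target: np.ndarray,
             n_steps: int) -> list[np.ndarray]:
    """Linearly blend from the story's current latent toward the desired
    ending latent: weight t on the target ramps from 0 to 1 over n_steps."""
    return [(1.0 - t) * base + t * target
            for t in np.linspace(0.0, 1.0, n_steps)]
```

Decoding each blended latent in turn would (under this sketch's assumptions) drift the generated text toward the target note rather than forcing an exact ending string.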
So it’s definitely not invincible: you do not yet get full control over the model with this technique. However, I would have you notice a few things:
Very little optimization effort has been put into this technique, or into text VAEs in general, compared to GPT-N. Rather than thinking of this as the power the method has, think of it as a lower bound: the thing you can do with a modest compute budget and a few dedicated researchers.
I haven’t yet implemented everything I want in terms of inference techniques. A potentially big piece of low-hanging fruit is classifier-free guidance, which is what took CLIP-conditioned diffusion from mediocre to quite good.
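For reference, classifier-free guidance amounts to extrapolating past the conditional prediction, away from the unconditional one. A minimal sketch (the function name is illustrative, and exactly how this would plug into the text VAE's decoder is an assumption, not something the post specifies):

```python
import numpy as np

def classifier_free_guidance(cond: np.ndarray, uncond: np.ndarray,
                             scale: float = 3.0) -> np.ndarray:
    """Amplify the direction the condition pushes the prediction in.
    scale=1 recovers the plain conditional prediction; scale>1 extrapolates."""
    return uncond + scale * (cond - uncond)
```

In diffusion models `cond` and `uncond` are the model's predictions with and without the conditioning signal; the analogous move here would be to run the latent predictor with and without the steering latent.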
Ideally I’ll be able to set up something like a Google Colab, Gradio demo, or Hugging Face Space to let people try the model without setting up a local environment, since in practice models don’t really seem to exist to people unless there’s a convenient way to run inference with them in their browser or on their phone.
In the meantime, here’s a simple example. I’d be happy to run context + prompt pairs submitted by the audience in this thread if people want to see what the model will do but don’t want to go to the trouble of setting it up themselves.
context = "A little girl was walking her dog on a clear day. Everything was bright and happy for the two. They stopped to admire a field of flowers and the dog sniffed at the dirt."
(Pdb) apply_op(vae_model, router, context, "Without warning it began to rain. I had never seen such a torrential downpour before. Every part of my clothes got soaked and the streets filled with the waters of a spring rain. It made me happy because I like the rain, I enjoyed getting soaked.")
'<s> A little girl was walking her dog on a clear day. Everything was bright and happy for the two. They stopped to admire a field of flowers and the dog sniffed at the dirt. The girl looked around and saw a purplish part of the turf. It made her so happy because every part of the turf looked the same. She saw so many flowers that it made her so happy because she liked the colors, she liked the way it got her to feel good. The dog saw a good place for his nap. The girl ran over to the dog. She wanted to say goodbye but he walked away from her. She saw the colors of the colored field. She went back over to the dog but he still walked away.\nThe dog walked away then she chased the dog, she tumbled'
(Pdb) apply_op(vae_model, router, context, "Without warning it began to rain. I had never seen such a torrential downpour before. Every part of my clothes got soaked and the streets filled with the waters of a spring rain. It made me happy because I like the rain, I enjoyed getting soaked.")
'<s> A little girl was walking her dog on a clear day. Everything was bright and happy for the two. They stopped to admire a field of flowers and the dog sniffed at the dirt. Then they walked by some apple trees and started to partake of the apples. Tori Tori liked every part of her walks so much that she got a little tired. She saw a spring of water and made her way to it because she liked the feeling of getting wet and she had been thirsty. It was quite cool on the day she walked and she made her way to the water slowly. They were immediately refreshed.\nThe two made it to a field of corn. This was by far her favorite thing. The corn caught her mood and she began playing in it even more! She ran up and down the'
(Pdb) apply_op(vae_model, router, context, "Without warning it began to rain. I had never seen such a torrential downpour before. Every part of my clothes got soaked and the streets filled with the waters of a spring rain. It made me happy because I like the rain, I enjoyed getting soaked.")
'<s> A little girl was walking her dog on a clear day. Everything was bright and happy for the two. They stopped to admire a field of flowers and the dog sniffed at the dirt. It was soon that their peace was disturbed by a torrential part of rain. It made every part of the ground wet and wet the streets. It made the girl so happy because she loved the rain. It made the girl so happy because she loved the rain. She was dancing, spinning, jumping, and running.\nThen, the young girl realized that something was wrong. She looked down at her dog. The poor dog was soaked. Its fur was completely drenched. The dog seemed so upset as it walked alongside of its owner, the little girl. "Oh no, look! The dog\'s hat'
(Pdb) apply_op(vae_model, router, context, "Without warning it began to rain. I had never seen such a torrential downpour before. Every part of my clothes got soaked and the streets filled with the waters of a spring rain. It made me happy because I like the rain, I enjoyed getting soaked.")
'<s> A little girl was walking her dog on a clear day. Everything was bright and happy for the two. They stopped to admire a field of flowers and the dog sniffed at the dirt. They walked until the blinding sun was tormenting every part of her parts. She smiled because every part of her parts felt so good. She liked the streets so much that she felt so happy. It made her ecstatic, I get to see the streets every day, she thought. The girl wondered when the sun would be so hot again. She was so happy that she was no longer worried about where the sun would be.\nThe sun is always coming and going, she got to think about another reason to get excited. The blinding sun was too much to handle so she folded her arms and went back home. She'
I would further have you notice that in this example my prompt is in the first person but is applied in-context to the story in the third person. This ability to take a sensory input from one context and reapply it in another is the secret of comprehension, as Mu put it: the ability to take the universe’s latent programs observed in one context outside the self and replay them to guide the policy’s actions in a new context. If your action space and your epistemology share a representation, you can translate observation into action whenever the context implies the replayed latent sequence should yield actions rather than an observation. This unifies action and epistemology in the same vein as active inference/Fristonian free energy. Hence Mu’s epigram at the start of the post.
since in practice it seems like models don’t really exist to people unless there’s a convenient way to inference with them in their browser or on their phone.
I think it’s more a matter of interest vs. effort. For example, I went through Collin Burns’s CCS.ipynb because my interest was high enough to justify the small overhead of getting it running.
Thanks for the examples. The third example was good, the second was okay, and the first and fourth didn’t seem very good. Interested to see how this develops.
BTW, I was curious to see a concrete example where the same prompt is applied to two different contexts.
It’s cool that this works (at least a bit)! It reminds me of the world models in RL agents, as these also have an encoder, a decoder, and a latent-space predictor (conditioned on action). I wonder how long it will be before someone uses an LLM as an explicit world model in an agent.
Given the general power of pretrained LLMs, it may help with the data efficiency of RL agents (ignoring the LLM pretraining).
Making an agent won’t help with alignment, but having a world model (and its associated state) to inspect might.