Hm, that might be a point of confusion. I agree that there’s no agentic stuff, at least without RL or a memory source, but the LLM is still pursuing the goal of maximizing the likelihood of the training data, which comes apart from human preferences pretty quickly, for many reasons.
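To make “maximizing the likelihood of the training data” concrete, here’s a minimal sketch of the pretraining objective, with random tensors standing in for a real model and dataset (all the names and shapes are just illustrative):

```python
import torch
import torch.nn.functional as F

# Toy stand-ins: a batch of training token ids and the model's logits for them.
# In real pretraining, `logits` would come from the transformer itself.
vocab_size = 8
tokens = torch.randint(vocab_size, (1, 6))   # (batch, seq_len) training sequence
logits = torch.randn(1, 6, vocab_size)       # (batch, seq_len, vocab) model outputs

# Next-token prediction: position t predicts token t+1, so shift by one.
nll = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions at positions 0..T-2
    tokens[:, 1:].reshape(-1),               # targets: the actual next tokens
)
# Minimizing this negative log-likelihood is exactly maximizing the likelihood
# of the training data; nothing in the loss mentions what humans prefer.
print(nll.item())
```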
You’re right that it doesn’t actively intervene, mostly because of the following:
There’s no RL, usually.
It is memoryless, in the sense that it forgets everything from one call to the next.
It doesn’t have a way to store arbitrarily long/complex problems in its memory, nor can it write memories to anything like a brain.
But the maximum-likelihood objective still gives you misaligned behavior, and I’ll give you examples:
Completing buggy Python code in a buggy way (see the sketch after the links below)
https://arxiv.org/abs/2107.03374
Or espousing views consistent with those expressed in the prompt (sycophancy).
https://arxiv.org/pdf/2212.09251.pdf
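If you want to poke at the first one yourself, a rough way to probe it is to feed a code model a prompt that already contains a subtle bug and sample a few completions. This is just a sketch: the model name is a placeholder (swap in a code-trained checkpoint to get closer to the Codex paper’s setup), and whether the bug actually propagates depends on the checkpoint and the sampling settings.

```python
from transformers import pipeline

# "gpt2" is only a placeholder so the snippet runs out of the box;
# a code-trained checkpoint is what you'd actually want here.
generator = pipeline("text-generation", model="gpt2")

# The prompt already contains an off-by-one bug: range(len(xs) - 1)
# skips the last element. A pure likelihood-maximizer that has seen lots
# of buggy code has no particular reason to fix it rather than continue it.
buggy_prompt = '''def sum_list(xs):
    total = 0
    for i in range(len(xs) - 1):
'''

completions = generator(
    buggy_prompt,
    max_new_tokens=40,
    num_return_sequences=3,
    do_sample=True,
    temperature=0.8,
)
for out in completions:
    print(out["generated_text"])
    print("-" * 40)
```

The point isn’t that this exact setup reproduces the paper’s results, just that nothing in the objective pushes the model toward the fixed version of the code.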
So the LLM is still optimizing for maximum likelihood; it just has certain limitations that make the misalignment passive instead of active.