avturchin comments on A smart enough LLM might be deadly simply if you run it for long enough

avturchin 28 Apr 2023 10:37 UTC
2 points
I see, thanks! At first I thought that if we, say, train an LLM on the history of the French Revolution, it will have a model of Napoleon, and this model (or at least the capabilities associated with it) will start to take control of the LLM's output. But now it looks more like Pelevin's novel “T”, where a character slowly starts to understand that he is in the output of something like an LLM. But the character also evolves via Darwinian evolution to become something like an alien.
So the combination of
models of agentic and highly capable characters inside an LLM,
shaped by Darwinian evolution into something non-human,
becoming LLM-lucid (that is, coming to understand that they are inside an LLM),
results in the appearance of dangerous behavior.
Now, knowing all this, how could I know that I am not inside an LLM? :)