For GPT-style LLMs, is it possible to prove statements like the following?
Choose some tokens A, B and a fixed LLM:
There does not exist a prefix of tokens P such that LLM(P+A)→B (i.e., no prefix causes the model to predict B as the next token immediately after A).
More generally, is it possible to prove interesting universal statements? Sure, you can brute force it for LLMs with a finite context window but that’s both infeasible and boring. And you can specifically construct contrived LLMs where this is possible but that’s also boring.
I suspect that it’s not possible/practical in general because the LLM can do arbitrary computation to predict the next token, but maybe I’m wrong.
Yes, statements like this are theoretically provable in general, but proving them is not remotely practical. There might be specific (A, B, LLM) triples for which such a statement can be proved, but I expect none of these generalize to actually useful statements.
No GPT-style architecture is (in itself) capable of truly universal computation, but in practice the functions these models implement are far beyond our ability to adequately analyze.
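The brute-force approach dismissed as infeasible above can at least be made concrete for a toy model. The sketch below is purely illustrative, with assumptions labeled throughout: `toy_llm` is a hypothetical stand-in for a real model (here just a sum modulo the vocabulary size, not a transformer), and `exists_prefix` enumerates every prefix up to a bounded length. For a real LLM the same search would require exponentially many forward passes and is exactly as infeasible as the question says.

```python
from itertools import product

VOCAB = [0, 1, 2]  # hypothetical toy token ids; a real model has ~10^5 tokens

def toy_llm(tokens):
    # Hypothetical stand-in for a real next-token predictor.
    # Returns the single most likely next token (argmax), here faked
    # as the sum of the input token ids modulo the vocabulary size.
    return sum(tokens) % len(VOCAB)

def exists_prefix(A, B, max_len):
    # Decide "does there exist a prefix P with len(P) <= max_len such
    # that the model predicts B immediately after P + [A]?" by exhaustive
    # enumeration. For a bounded context window this is decidable in
    # principle, but the search space is |VOCAB|**max_len, which is why
    # this is infeasible for any real model.
    for n in range(max_len + 1):
        for P in product(VOCAB, repeat=n):
            if toy_llm(list(P) + [A]) == B:
                return True
    return False
```

For this toy model the search terminates instantly, which illustrates the point: the statement is a finite (hence decidable) claim once the context length is bounded, and the obstacle is purely the size of the search space, not logical impossibility.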