The solution is IMO just to consider the number of computations performed per generated token as some function of the model size, and once we’ve identified a suitable asymptotic order on the function, we can say intelligent things like “the smallest network capable of solving a problem in complexity class C of size N is X”.
Or if our asymptotic bounds are not tight enough:
“No economically feasible LLM can solve problems in complexity class C of size >= N”.
(Where economically feasible may be something defined by aggregate global economic resources or similar, depending on how tight you want the bound to be.)
Regardless, we can still obtain meaningful impossibility results.
Caveat to the caveat: