The solution is IMO just to treat the number of computations performed per generated token as some function of the model size. Once we’ve identified a suitable asymptotic order for that function, we can say intelligent things like “the smallest network capable of solving problems of size N in complexity class C has size X”.
Or if our asymptotic bounds are not tight enough:
“No economically feasible LLM can solve problems of size >= N in complexity class C.”
(Where “economically feasible” might be defined in terms of aggregate global economic resources or similar, depending on how tight you want the bound to be.)
Regardless, we can still obtain meaningful impossibility results.
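As a toy illustration of the kind of bound this yields (every constant below is a hypothetical assumption chosen for illustration, including the ~2·P FLOPs-per-token estimate, the Ω(2^N) lower bound standing in for class C, and the 1e26 FLOPs budget):

```python
# Toy sketch of the impossibility-result argument above.
# Every constant here is an illustrative assumption, not a measured value.

def flops_per_token(n_params: float) -> float:
    """Per-token forward-pass cost: ~2 FLOPs per parameter
    (standard back-of-the-envelope estimate for dense transformers)."""
    return 2.0 * n_params

def min_ops_required(problem_size: int) -> float:
    """Hypothetical Omega(2^N) lower bound standing in for
    'problems of size N in complexity class C'."""
    return 2.0 ** problem_size

def feasible(n_params: float, max_tokens: float, problem_size: int,
             budget_flops: float) -> bool:
    """Can a model of this size, within the compute budget, emit enough
    total computation to meet the problem's lower bound?"""
    total = flops_per_token(n_params) * max_tokens
    return total <= budget_flops and total >= min_ops_required(problem_size)

# ~1e26 FLOPs as a stand-in for "aggregate global economic resources".
print(feasible(n_params=1e12, max_tokens=1e9,
               problem_size=120, budget_flops=1e26))  # False: 2^120 ~ 1.3e36
```

The particular numbers don’t matter; the point is that once the per-token cost function and the problem’s lower bound are pinned down, the feasibility check is mechanical.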
Very big caveat: the LLM doesn’t actually perform O(1) computations per generated token.
The number of computational steps performed per generated token scales with network size: https://www.lesswrong.com/posts/XNBZPbxyYhmoqD87F/llms-and-computation-complexity?commentId=QWEwFcMLFQ678y5Jp
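To make that concrete, here’s a minimal sketch under standard dense-transformer approximations (the ~12·L·d² parameter count and ~2 FLOPs per parameter per token are textbook back-of-the-envelope estimates, not figures from the linked comment): total work per token is O(P) in parameter count, and serial depth per token is O(L) in layer count, not O(1).

```python
# Sketch: per-token compute for a dense transformer scales with model size.
# Uses the textbook ~12*L*d^2 parameter count and ~2 FLOPs/param/token
# forward-pass estimates -- assumptions, not figures from the linked comment.

def per_token_cost(n_layers: int, d_model: int) -> tuple[float, int]:
    n_params = 12 * n_layers * d_model ** 2  # rough dense-transformer count
    total_flops = 2.0 * n_params             # total work per token: O(P)
    serial_depth = n_layers                  # sequential steps per token: O(L)
    return total_flops, serial_depth

# Shapes roughly matching GPT-2 small, GPT-2 XL, and GPT-3.
for layers, width in [(12, 768), (48, 1600), (96, 12288)]:
    flops, depth = per_token_cost(layers, width)
    print(f"{layers:3d} layers, d_model={width:5d}: "
          f"~{flops:.1e} FLOPs/token, {depth} serial steps/token")
```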
Caveat to the caveat: