Worth noting that LLMs no longer seem to use attention whose cost scales quadratically with context length. See e.g. Claude-Long; it seems they've figured out how to make it roughly linear. The 32K context window option GPT-4 offers to corporate clients suggests OpenAI isn't relying on quadratic scaling anymore either.
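For illustration only (the comment doesn't say which technique any lab actually uses, and this is just one of several ways to get sub-quadratic attention): standard softmax attention materializes an n×n score matrix, so compute and memory grow with the square of context length, while kernelized "linear attention" variants in the style of Katharopoulos et al. (2020) never build that matrix and scale roughly linearly. A minimal NumPy sketch of the difference, with hypothetical shapes:

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: materializes an (n, n) score matrix,
    so compute and memory grow quadratically with sequence length n."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])                 # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                       # (n, d)

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelized 'linear attention' sketch: replacing softmax(QK^T)
    with phi(Q) phi(K)^T lets us precompute phi(K)^T V once as a
    (d, d) summary, so the cost is O(n) in sequence length."""
    Qp, Kp = phi(Q), phi(K)                                  # (n, d) each
    KV = Kp.T @ V                                            # (d, d), independent of n
    Z = Qp @ Kp.sum(axis=0, keepdims=True).T                 # (n, 1) normalizer
    return (Qp @ KV) / Z                                     # (n, d)

# Toy sizes, not anything a production model uses.
n, d = 4096, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out_quadratic = softmax_attention(Q, K, V)   # builds a 4096 x 4096 matrix
out_linear = linear_attention(Q, K, V)       # never builds anything n x n
```

The point is only the asymptotics: at a 32K context, the quadratic version would need an attention matrix with about 32768² ≈ 1.07 billion entries per head per layer, whereas the linear-style variant keeps only d×d summaries whose size doesn't grow with context length.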