Nitpick: a language model is basically just an algorithm to predict text. It doesn’t necessarily need to be a fixed architecture like ChatGPT. So for example: “get ChatGPT to write a program that outputs the next token and then run that program” is technically also a language model, and has no computational complexity limit (other than the underlying hardware).
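To make the nitpick concrete, here's a minimal sketch (Python; all names are mine, just for illustration) of the interface this definition implies: anything that maps a text prefix to a probability distribution over next tokens counts as a language model, regardless of the computation behind it.

```python
from typing import Callable, Dict

# Under this definition, a language model is just any function from a
# text prefix to a probability distribution over possible next tokens.
LanguageModel = Callable[[str], Dict[str, float]]

def toy_lm(prefix: str) -> Dict[str, float]:
    """A deliberately tiny language model: a fixed next-word lookup."""
    table = {
        "the": {"cat": 0.6, "dog": 0.4},
        "cat": {"sat": 0.7, "ran": 0.3},
    }
    words = prefix.split()
    last = words[-1] if words else ""
    return table.get(last, {"the": 1.0})

# A transformer, an n-gram table, or "have ChatGPT write a predictor
# program and then execute it" all satisfy this same signature; the
# definition says nothing about what computation sits behind the function.

print(toy_lm("the"))  # {'cat': 0.6, 'dog': 0.4}
```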
Hmm, I’ve not seen people refer to (ChatGPT + Code execution plugin) as an LLM. IMO, an LLM is supposed to be a language model consisting of just a neural network with a large number of parameters.
I think your definition of LLM is the common one. For example, https://www.lesswrong.com/posts/KJRBb43nDxk6mwLcR/ai-doom-from-an-llm-plateau-ist-perspective is on the front page right now, and it uses LLM to refer to a big neural net with a transformer architecture, trained on a lot of data. This is how I was intending to use it as well. Note the difference between “language model” as Christopher King used it, and “large language model” as I am using it here. I plan to keep using LLM for now, especially as GPT refers to OpenAI’s product and not the general class of things.
Thanks, this is exactly the kind of feedback I was hoping for.
Nomenclature-wise: I was using LLM to mean “deep neural nets in the style of GPT-3”, but I should be more precise. Do you know of a good term for what I meant?
More generally, I should learn about other styles of LLM. I’ve gotten some good leads from these comments and some DMs.
Hmm, how about GPT (generative pre-trained transformer)?