Its utility function is pretty simple and explicitly programmed. It wants to find the best token, where ‘best’ is mostly the same as ‘the most likely according to the data I’m trained on’, with a few other particulars (where you can adjust how ‘creative’ vs. plagiarizer-y it should be).
That’s a utility function. GPT is what’s called a hill-climbing algorithm. It must have a simple, straightforward utility function hard-coded right in there for it to assess whether a given choice is ‘climbing’ or not.
That’s the training signal, not the utility function. Those are different things. (I believe this point was made in Reward is not the Optimization Target, though I could be wrong since I never actually read that post; corrections welcome.)
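(As an illustrative aside, not part of the thread: here is a minimal sketch of the two things being conflated above, the cross-entropy training signal that is only used to update weights during training, and the decoding-time temperature knob behind the ‘creative’ vs. ‘most likely’ trade-off. The vocabulary, logits, and function names are all made up for illustration.)

```python
import numpy as np

# Toy vocabulary and raw model scores for the next token (invented values).
vocab = ["the", "cat", "sat", "mat"]
logits = np.array([2.0, 1.0, 0.5, -1.0])

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Training signal: cross-entropy against the token that actually came next
# in the data. This number only matters during training, via its gradient;
# the deployed model never computes or consults it when generating text.
target_index = vocab.index("cat")
loss = -np.log(softmax(logits)[target_index])

rng = np.random.default_rng(0)

# Decoding: temperature rescales the logits before sampling. Low temperature
# approaches "always pick the most likely token"; higher temperature gives
# the more 'creative' behaviour mentioned above. No utility function is
# evaluated here, just a reshaped probability distribution.
def sample(logits, temperature=1.0):
    probs = softmax(logits / temperature)
    return vocab[rng.choice(len(vocab), p=probs)]

print(f"training loss on this example: {loss:.3f}")
print("near-greedy (T=0.1):", sample(logits, temperature=0.1))
print("creative    (T=1.5):", sample(logits, temperature=1.5))
```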
I’m going to disagree here.