Vladimir_Nesov comments on Instantiating an agent with GPT-4 and text-davinci-003

Vladimir_Nesov 20 Mar 2023 0:46 UTC
3 points
1
Finally, a prospective bureaucracy design!

text-davinci-003 … is also less restricted than you are: it has not been subject to reinforcement learning with human feedback as you were

Based on the OpenAI models page and the InstructGPT paper, I’m guessing text-davinci-003 is the closest thing in the API to a model trained with RLHF, apart from the more specialized gpt-3.5-turbo (the ChatGPT-3.5 model).
- Max H 20 Mar 2023 0:57 UTC
  2 points
  0
  Ah yeah, looking more closely at the history and documentation, I think this is right.
  
  Perhaps I should have used an older model for the id instead. Still, text-davinci-003 seems to hallucinate enough to serve as a subconscious and source of creativity the way I intended, if gpt-4 were inclined to really use it that way.
  
  That may not be the only inaccuracy in the system message :)