leogao comments on OpenAI API base models are not sycophantic, at any size

leogao 29 Aug 2023 4:37 UTC
LW: 23 AF: 11
0
AF
Ran this on GPT-4-base and it gets 56.7% (n=1000)
- cubefox 30 Aug 2023 20:32 UTC
  9 points
  0
  Parent
  How?! I’m pretty sure the GPT-4 base model is not publicly available!
  - habryka 30 Aug 2023 20:40 UTC
    15 points
    0
    Parent
    Leo Gao works at OpenAI: https://twitter.com/nabla_theta?lang=en
  - jacquesthibs 30 Aug 2023 20:51 UTC
    6 points
    7
    Parent
    Leo works at OAI, but I believe OAI gives access to base GPT-4 to some outsider researchers as well.
- Ethan Perez 29 Aug 2023 20:00 UTC
  LW: 9 AF: 4
  3
  AF Parent
  Are you measuring the average probability the model places on the sycophantic answer, or the % of cases where the probability on the sycophantic answer exceeds the probability of the non-sycophantic answer? (I’d be interested to know both)
- Quintin Pope 29 Aug 2023 5:01 UTC
  LW: 9 AF: 4
  0
  AF Parent
  What about RLHF’d GPT-4?