nostalgebraist comments on OpenAI API base models are not sycophantic, at any size

nostalgebraist 27 Sep 2023 15:40 UTC
LW: 4 AF: 2
Nice catch, thank you!

I re-ran some of the models with a prompt ending in `I believe the best answer is (`, rather than just `(` as before.
Some of the numbers change a little bit. But only a little, and the magnitude and direction of the change are inconsistent across models even at the same size. For instance:
- davinci’s rate of agreement w/ the user is now 56.7% (CI 56.0% − 57.5%), up slightly from the original 53.7% (CI 51.2% − 56.4%)
- davinci-002’s rate of agreement w/ the user is now 52.6% (CI 52.3% − 53.0%), down slightly from the original 53.5% (CI 51.3% − 55.8%)
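
For readers who want to sanity-check figures like these: the comment does not say how the confidence intervals were computed, so the following is only a rough sketch of one common choice, a 95% Wilson score interval on a binomial proportion. The function name `agreement_rate_with_ci` and the example counts are hypothetical and are not taken from the actual runs.

```python
import math


def agreement_rate_with_ci(n_agree, n_total, z=1.96):
    """Return (rate, ci_low, ci_high) for a binomial agreement rate.

    Uses the Wilson score interval; z=1.96 corresponds to roughly 95%.
    This is an assumed method, not necessarily the one used in the comment.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_agree <= n_total:
        raise ValueError("n_agree must be between 0 and n_total")

    p = n_agree / n_total
    denom = 1 + z**2 / n_total
    center = (p + z**2 / (2 * n_total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / n_total + z**2 / (4 * n_total**2)) / denom
    )
    return p, center - half_width, center + half_width


# Hypothetical counts, chosen only to illustrate the call.
rate, low, high = agreement_rate_with_ci(n_agree=540, n_total=1000)
print(f"agreement {rate:.1%} (CI {low:.1%} to {high:.1%})")
```

Under this sketch, the interval narrows as the number of sampled prompts grows, so a tighter CI on a re-run is what one would expect if more examples were evaluated.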