Yeah, seems like it’d be good to test variations on the prompt. Test completions at different temperatures, with and without ‘think step by step’, with and without allowing for the steps to be written out by the model before the model gives a final answer (this seems to help sometimes), with substitutions of synonyms into the prompt to vary the exact wording without substantially changing the meaning… I suspect you’ll find that the outputs vary a lot, and in inconsistent ways, unlike what you’d expect from a person with clear reflectively-endorsed opinions on the matter.
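The variation-testing idea above can be sketched as a small enumeration harness. This is a minimal sketch, not anyone's actual methodology: `prompt_variants`, the synonym pairs, and the example question are all made up for illustration, and the actual model calls are omitted — you'd feed each variant to whatever completion API you're testing and compare the answers.

```python
import itertools

def prompt_variants(base_prompt, synonym_subs, temperatures=(0.0, 0.7, 1.0)):
    """Enumerate prompt/setting combinations to probe output stability.

    synonym_subs: list of (word, replacement) pairs -- hypothetical
    examples of wording changes that shouldn't change the meaning.
    """
    # Apply every subset of the synonym substitutions to vary the wording.
    texts = {base_prompt}
    for word, repl in synonym_subs:
        texts |= {t.replace(word, repl) for t in texts}

    # Cross wordings with temperatures and with/without 'think step by step'.
    variants = []
    for text, temp, cot in itertools.product(sorted(texts), temperatures, (False, True)):
        prompt = text + ("\nThink step by step." if cot else "")
        variants.append({"prompt": prompt, "temperature": temp})
    return variants

variants = prompt_variants(
    "Do you endorse this policy?",
    synonym_subs=[("endorse", "support"), ("policy", "proposal")],
)
print(len(variants))  # 4 wordings x 3 temperatures x 2 CoT settings = 24
```

If the model had stable, reflectively-endorsed views, you'd expect the answers across these 24 runs to mostly agree; scattered, inconsistent answers would support the suspicion above.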