Chris_Leong comments on Measuring and Improving the Faithfulness of Model-Generated Reasoning

Chris_Leong 19 Jul 2023 7:45 UTC
LW: 6 AF: 3
Do you have a theory for why chain-of-thought decomposition helps?
- Ansh Radhakrishnan 19 Jul 2023 18:53 UTC
  LW: 4 AF: 3
  Honestly, I don’t think we have any very compelling ones! We gesture at some possibilities in the paper, such as it being harder for the model to ignore its reasoning when it’s in an explicit question-and-answer format (as opposed to a more free-form CoT), but I don’t think we have a good understanding of why it helps.
  
  It’s also worth noting that CoT decomposition helps mitigate the ignored-reasoning problem, but it is actually more susceptible to biasing features in the context than CoT is. Depending on how you weigh the two, it’s possible that CoT might still come out ahead on reasoning faithfulness (we chose to weigh the two equally).