Figure out a way to show users the CoT of reasoning/agent models that you release in the future. (i.e. don’t do what OpenAI did with o1). Doesn’t have to be all of it, just has to be enough—e.g. each user gets 1 CoT view per day.
What would be the purpose of 1 CoT view per user per day?
For scientific purposes. People don’t really have time to review many CoT chains anyway, so 1 per day captures most of the value of what they’d realistically do. Plus they can target it at whatever seems suspicious. (Simple example: Suppose they get an impressive-seeming answer that later turns out to be total BS hallucination. They then think “I wonder if the model was BSing me” and click “view CoT.” Then they can see whether it was an innocent mistake or not.)