I sympathize with the annoyance, but I think the response from the broader safety crowd (e.g., your Manifold market, substantive critiques and general ill-reception on LessWrong) has actually been pretty healthy overall; I think it’s rare that peer review or other forms of community assessment work as well or as quickly.
Hendrycks had ample opportunity after initial skepticism to remove it, but chose not to.
IMO, this seems to demand a very immediate/sudden/urgent reaction. If Hendrycks ends up being wrong, I think he should issue some sort of retraction (and I think it would be reasonable to be annoyed if he doesn’t.)
But I don’t think the standard should be “you need to react to criticism within ~24 hours” for this kind of thing. If you write a research paper and people raise important concerns about it, I think you have a duty to investigate them and respond to them, but I don’t think you need to fundamentally change your mind within the first few hours/days.
I think we should afford researchers the time to seriously evaluate claims/criticisms, reflect on them, and issue a polished statement (and potential retraction).
(Caveat: there are some cases where immediate action is needed, e.g., if a company releases a product that is imminently dangerous, but I don’t think “making an intellectual claim about LLM capabilities that turns out to be wrong” would meet my bar.)
Under peer review, this never would have been seen by the public. It would have incentivized CAIS to actually think about the potential flaws in their work before blasting it to the public.
What is reasonable here? 2 weeks? 2 months?
Hm, good question. I think it should be proportional to the amount of time it would take to investigate the concern(s).
For this, I think 1-2 weeks seems reasonable, at least for an initial response.