I am really annoyed by the Twitter thread about this paper. I doubt it will hold up, and it’s been seen 450k times. Hendrycks had ample opportunity after initial skepticism to remove it, but chose not to. I expect this to have reputational costs to him and to AI safety in general. If people think he (and by association some of us) are charlatans for saying one thing and doing another in terms of being careful with the truth, I will have some sympathy with their position.
I sympathize with the annoyance, but I think the response from the broader safety crowd (e.g., your Manifold market, substantive critiques and general ill-reception on LessWrong) has actually been pretty healthy overall; I think it’s rare that peer review or other forms of community assessment work as well or quickly.
Hendrycks had ample opportunity after initial skepticism to remove it, but chose not to.
IMO, this seems to demand a very immediate/sudden/urgent reaction. If Hendrycks ends up being wrong, I think he should issue some sort of retraction (and I think it would be reasonable to be annoyed if he doesn’t).
But I don’t think the standard should be “you need to react to criticism within ~24 hours” for this kind of thing. If you write a research paper and people raise important concerns about it, I think you have a duty to investigate them and respond to them, but I don’t think you need to fundamentally change your mind within the first few hours/days.
I think we should afford researchers the time to seriously evaluate claims/criticisms, reflect on them, and issue a polished statement (and potential retraction).
(Caveat that there are some cases where immediate action is needed, e.g., if a company releases a product that is imminently dangerous, but I don’t think “making an intellectual claim about LLM capabilities that turns out to be wrong” would meet my bar.)
What is reasonable here? 2 weeks? 2 months?
Hm, good question. I think it should be proportional to the amount of time it would take to investigate the concern(s).
For this, I think 1-2 weeks seems reasonable, at least for an initial response.
Under peer review, this never would have been seen by the public. It would have incentivized CAIS to actually think about the potential flaws in their work before blasting it to the public.