I suggest removing Gleave’s critique of guaranteed safe AI. It’s not object-level, doesn’t include any details, and is mostly just vibes.
My main hesitation is I feel skeptical of the research direction they will be working on (theoretical work to support the AI Scientist agenda). I’m both unconvinced of the tractability of the ambitious versions of it, and more tractable work like the team’s previous preprint on Bayesian oracles is theoretically neat but feels like brushing the hard parts of the safety problem under the rug.
Gleave doesn’t give any reasons why he is unconvinced of the tractability of the ambitious versions of guaranteed safe AI. He also doesn’t explain why he thinks the Bayesian oracle paper brushes the hard parts of the safety problem under the rug.
His critique is basically, “I saw it, and I felt vaguely bad about it.” I don’t think it should be included, as it dilutes the thoughtful critiques and doesn’t provide much value to the reader.
I think your comment adds a relevant critique of the criticism, but given that this comes from someone contributing to the project, I don’t think it should be left out altogether. I added a short summary and a hyperlink to a footnote.
Sounds good to me!