We don’t consider any research area to be blanket safe to publish. Instead, we consider all releases on a case by case basis, weighing expected safety benefit against capabilities/acceleratory risk. In the case of difficult scenarios, we [Anthropic] have a formal infohazard review procedure.
This review procedure doesn’t seem to be very public, though, unlike aspects of Conjecture’s policy.
According to Chris Olah: