I generally like the linked essay by Beren, but I don’t like this linkpost, especially the title, because I dispute that Beren’s essay is on the topic of “the problem with infohazards as a concept”. My very strong impression is that Beren (like me) thinks that “infohazards” is a perfectly valid and useful concept. In particular, Beren’s essay starts with:
Obligatory disclaimer: This post is meant to argue against overuse of infohazard norms in the AI safety community and demonstrate failure modes that I have personally observed. It is not an argument for never using infohazards anywhere or that true infohazards do not exist. None of this is meant to be an absolute statement. Caveat emptor. Use common sense. If you have actually stumbled across some completely new technique which speeds up training by 100x or whatever then, in general, you should not share it. Most of the concerns here are presented with a mistake theory frame – it is assumed that these pathologies arise not due to any specific bad intentions but rather due to natural social and epistemic dynamics.
My opinion is: Infohazards exist. But people have to figure out in each individual instance whether something is or isn’t an infohazard (i.e., whether the costs of keeping it secret outweigh the benefits, versus the other way around). And (from my perspective) figuring that out is generally very hard.
For one thing, figuring out what is or isn’t an infohazard is inherently hard, because there are a bunch of considerations entering the decision, all of which involve a lot of uncertainty, partly because they may involve trying to guess things about future intellectual & tech progress (which is notoriously hard).
For another thing, that judgment has poor feedback mechanisms, so you can be really bad at deciding what is or isn't an infohazard, for a very long time, without noticing and correcting.
In this background context, Beren is arguing that, in his experience, people in AI alignment are systematically going wrong by keeping things secret that, all things considered, would have been better discussed openly.
That’s (a priori) a plausible hypothesis, and it’s well worth discussing whether it’s true or false. But either way, I don’t see it as indicating a problem with infohazards as a concept.
Sorry if I’m misunderstanding the OP’s point or putting words in anyone’s mouth.
While I mostly agree with you here, which is why I'll change the title soon, I do think the point about encouraging a "great man" view of science and progress is closely related to the concept of infohazards as used by the LW community: as the community uses the term, infohazards tend to imply that small groups or individuals can discover world-ending technology, and I think a sort of "Great Person Theory" of science falls out of that.
Beren’s arguing that this very model is severely wrong for most scientific fields, which is a problem with the concept of infohazards as used on LW.
Edit: I did change the title.