Do people generally consider wireheading and Goodhart’s law to be information hazards? They’re both “errors” caused by access to true data, data that is easy to misuse.
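For concreteness, here is a minimal sketch of the Goodhart failure mode in question; the greedy optimizer and the 3x payoff for gaming the metric are illustrative assumptions, not a claim about any particular system:

```python
# Toy Goodhart setup: we care about genuine work, but the agent can
# only observe and optimize a proxy score that also rewards gaming.

def true_value(work: float, gaming: float) -> float:
    """What we actually care about: only genuine work counts."""
    return work

def proxy(work: float, gaming: float) -> float:
    """What gets measured: gaming the metric pays off 3x (assumed)."""
    return work + 3.0 * gaming

work, gaming = 0.0, 0.0
for step in range(5):
    # Greedy optimizer: take whichever unit action raises the proxy more.
    if proxy(work + 1, gaming) > proxy(work, gaming + 1):
        work += 1
    else:
        gaming += 1
    print(f"step={step}  proxy={proxy(work, gaming):.0f}  true={true_value(work, gaming):.0f}")

# The proxy climbs (3, 6, 9, ...) while the true value stays at 0:
# once the measure becomes the target, it stops tracking what we care about.
```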
I would say that, to the extent that true information allowed those outcomes to occur, and to the extent that those outcomes were harmful, the true information posed an information hazard. I.e., there was a risk of harm from true information, and the harm was that harmful wireheading or Goodharting happened.
I.e., I’d say that the wireheading or Goodharting isn’t itself an information hazard; it’s the harm that the information led to.
In the same way, it’s not that a pandemic is an information hazard, but that a pandemic is a harm that spreading certain information could lead to, which makes that information hazardous.
Does that make sense?
ETA: I guess an implicit assumption I’m making is that access to the true information made the harmful wireheading or Goodharting more likely in these hypotheticals. If any random data, or a decent subset of all possible false data, would’ve also led to the harmful outcomes with similar likelihood, then there isn’t necessarily a risk arising from the information being true.
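A back-of-the-envelope version of that counterfactual test (the probabilities here are made-up placeholders, just to show the comparison):

```python
# Hypothetical numbers: how likely is the harmful outcome given the
# true information, versus given arbitrary/false data instead?
p_harm_given_true_info = 0.30
p_harm_given_random_data = 0.28

# The risk attributable to the information being *true* is the difference.
marginal_risk = p_harm_given_true_info - p_harm_given_random_data
print(f"marginal risk from the true information: {marginal_risk:.2f}")

# Near zero: the harm would likely have happened anyway, so the true
# information wasn't really the hazard. Large: the truth itself carried it.
```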
I suggested that, while a set of plans for a nuclear reactor might be true, and safe if executed correctly, executing them incorrectly might have effects similar to a nuke. Hence ‘stability’: if something is (almost) impossible for humans to execute correctly, and is unsafe when performed even slightly incorrectly, then it is ‘unstable’ (and dangerous in a different way than ‘stable’ designs for a nuke).
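One rough way to make ‘stability’ concrete is as an expected-outcome calculation; the payoffs and probabilities below are illustrative, not estimates:

```python
# Expected outcome of acting on a plan, given some chance of botching it.
def expected_outcome(p_correct: float, benefit: float, harm: float) -> float:
    return p_correct * benefit + (1.0 - p_correct) * harm

# 'Stable' design: a botched execution mostly just wastes the effort.
print(expected_outcome(p_correct=0.9, benefit=100, harm=-10))      # 89.0
# 'Unstable' design: a botched reactor behaves like a (dirty) bomb.
print(expected_outcome(p_correct=0.9, benefit=100, harm=-10_000))  # -910.0
```

On these made-up numbers, even a 90% chance of correct execution leaves the ‘unstable’ plan with sharply negative expected value, which is the sense in which it is dangerous despite being true and nominally safe.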
Misuse also brings in relativity: someone who would never use plans to build a nuke isn’t harmed by receiving them (absent other actors trying to steal said plans from them), which means information hazards are relative to the recipient.