No. I’m saying that if I value X, I can’t think of any information that would cause me to value NOT(X) instead.
Er, this seems to imply that you believe yourself immune to being hacked, which can’t be right; human brains are far from impregnable. Do you consider such things to not be information in this context, or are you referring to “I” in a general “If I were an AI” sense, or something else?
Mm, interesting question. I think that when I said it, I was referring to “I” in an “if I were an AI” sense. Or, rather, “if I were an AI properly designed to draw inferences from information while avoiding value drift,” since of course it’s quite possible to build an AI that doesn’t have this property. I was also clearly assuming that X is the only thing I value; if I value X and Y, discovering that Y implies NOT(X) might lead me to value NOT(X) instead. (Explicitly, I mean. In this example I started out valuing both X and NOT(X), but I didn’t necessarily know it.)
But the question of what counts as information (as opposed to reprogramming attempts) is an intriguing one that I’m not sure how to address. On five seconds’ thought, it seems that there’s no sharp line to be drawn between information and attempts to hack my brain, and that if I want such a distinction to exist I need to design a brain that enforces that kind of security… certainly evolution hasn’t done so.