I really like your post about hazards, which seems to be the most important one in the sequence. You examined an obvious-looking strategy of making an AI figure out what “humans” are, and found a crippling flaw in it. But I don’t understand some of your assumptions:
1) Why do you think that humans have a short description, or that our branch of the multiverse has a short pointer to it? I asked that question here. In one of the posts you give a figure of 10000 bits. I have no idea why that would be enough, considering the amount of quantum randomness that was involved in human evolution. I don’t even know of an argument for why the conditional K-complexity of one big human artifact given a full description of another big human artifact has to be low.
2) Why do you think a speed prior is a good idea for protection against the attacks? It doesn’t seem likely to me that an actual human is the most computationally efficient predictor of a human’s decisions. If we allow a little bit of error, an enemy AI can probably create much more efficient predictors.
1) Why do you think that humans have a short description, or that our branch of the multiverse has a short pointer to it? I asked that question here. In one of the posts you give a figure of 10000 bits. I have no idea why that would be enough, considering the amount of quantum randomness that was involved in human evolution. I don’t even know of an argument for why the conditional K-complexity of one big human artifact given a full description of another big human artifact has to be low.
(10000 was an upper bound on the extra penalty imposed on the “global description” if we use Levin rather than Kolmogorov complexity; the point was that 10000 bits is probably quite small compared to the K-complexity, especially if we rule out global descriptions.)
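For concreteness, the definitions I have in mind (a standard sketch, not notation from the posts): Kolmogorov complexity is K(x) = \min \{ |p| : U(p) = x \}, while Levin complexity also charges for runtime,

    Kt(x) = \min_p \{\, |p| + \log_2 t(p) \;:\; U(p) = x \,\}.

So moving from K to Kt costs a “global description” that simulates the universe for t steps an extra \log_2 t bits, and the 10000-bit figure is an upper bound on that extra term, i.e. it allows runtimes up to roughly 2^{10000} steps.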
I have an intuitive sense of how much surprise I have had across all of the observations I have made of the world, and this seems to be an upper bound for the negative log probability assigned by the distribution “things embedded in our universe.” If our universe has a short description, then this suggests that specifying “things embedded in the universe” and then drawing from among all of them contributes a lot to the universal prior mass on an MRI of your brain (it is much harder to say the same for Finnegans Wake, or any other small object, since my intuitive bound is only interesting when applied to a large fraction of all of a human’s experiences). I suspect this dominates, but am not sure.
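To spell out the step this argument needs (a sketch; Q and the constants are my notation, not from the posts): write Q for the distribution “choose a random thing embedded in our universe,” and m for the universal prior. Since Q is a semimeasure with a short description, the standard domination property gives

    -\log_2 m(x) \;\le\; K(Q) - \log_2 Q(x) + O(1).

If K(Q) is small (our universe has a short description) and my lifetime surprise upper-bounds -\log_2 Q(\text{my observations}), then the right-hand side is modest, so the “sample from our universe” term accounts for a large share of m’s mass on something like a brain MRI.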
2) Why do you think a speed prior is a good idea for protection against the attacks? It doesn’t seem likely to me that an actual human is the most computationally efficient predictor of a human’s decisions. If we allow a little bit of error, an enemy AI can probably create much more efficient predictors.
The question is whether an enemy AI is a better predictor than the best prediction algorithm of similar complexity. Of course there are AIs that are good simulators, but in general they can’t hijack the universal prior, because they need to be told what they should try to simulate, and at that point it would be cheaper just to cut out the AI. (There are exceptions; I discuss one, which obtains the necessary info about the human by simulating the universe, and which is therefore significantly injured by the complexity penalty.)
Also note that the speed prior doesn’t seem to help you very much, but I think a space-bounded version does rule out universe simulators and a few other difficulties.
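For reference, rough definitions of the two resource-bounded priors under discussion (a sketch in the spirit of Schmidhuber’s speed prior; the space-bounded form is my own illustrative variant, not a definition from the posts): the speed prior discounts each program by its runtime,

    S(x) \;\propto\; \sum_{p \,:\, U(p) = x} 2^{-|p|} \cdot \frac{1}{t(p)},

equivalently penalizing |p| + \log_2 t(p) as in Kt above, while a space-bounded analogue would penalize something like |p| + c \cdot s(p), where s(p) is the memory the program uses. Presumably a time penalty of \log_2 t is easy for a universe simulator to absorb, while any per-bit space penalty is ruinous for a program that must store an entire universe’s state.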