How do you handle anthropic scenarios (sleeping beauty, presumptuous philosopher, doomsday argument)?
By defining the utility function in terms of the universe, rather than in terms of a subjective-experience path. Anthropics are a problem for humans because our utility functions are defined in terms of a “self”, a notion that does not generalize well; for an AI, this would be a problem for us when writing the utility function to give it, but not for the AI doing the optimization.
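As a concrete (and entirely toy) illustration of what this means for Sleeping Beauty: score whole coin-outcome histories with the utility function and take an ordinary expectation over the coin, so that no self-locating probability ever has to be assigned. The setup, stakes, and names below are my own choices for illustration, not anything from the original article.

```python
# Toy Sleeping Beauty sketch (illustrative only). Setup: a fair coin is flipped;
# Heads -> one awakening, Tails -> two awakenings with memory erased in between.
# At each awakening the agent may take a bet that pays +1 if the coin was Tails
# and costs 1 if it was Heads (stakes chosen arbitrarily for illustration).
# The utility function scores the whole world-history, not "my" experience.

WORLDS = {            # world -> (probability, number of awakenings)
    "heads": (0.5, 1),
    "tails": (0.5, 2),
}

def world_utility(world, accept_bet):
    """Total payoff accumulated over every awakening that occurs in this world."""
    _, awakenings = WORLDS[world]
    if not accept_bet:
        return 0.0
    per_awakening = 1.0 if world == "tails" else -1.0
    return per_awakening * awakenings

def expected_utility(accept_bet):
    """Plain expectation over the coin flip; no 'which awakening am I?' term appears."""
    return sum(p * world_utility(w, accept_bet) for w, (p, _) in WORLDS.items())

if __name__ == "__main__":
    for policy in (True, False):
        print(f"always accept = {policy}: E[utility] = {expected_utility(policy):+.2f}")
    # Accepting gives 0.5*(-1) + 0.5*(+2) = +0.50, declining gives 0.00, so the
    # agent accepts -- the betting behaviour usually associated with the
    # "thirder" answer, obtained here without any anthropic reasoning at all.
```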
Imagine the AI wants to determine what the universe would look like if it filled it with paperclips. It is superintelligent, so it would have to have a very good reason to do so. Since, in this hypothetical situation, there is a very good reason to fill the universe with paperclips, it would get lots of utility. Therefore, it is a good reason.
There are two utility functions here, and the AI isn’t optimizing the paperclip one.
Your premises preclude the possibility of meeting another AI, which could be very dangerous.
This premise is unnecessary, since that possibility can be folded into the probability model.
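For instance (a minimal sketch with invented priors and payoffs, only to show what “folded into the probability model” means here): the possibility of meeting another AI becomes one more hypothesis that gets a prior and a term in the expected-utility sum, rather than a separate premise that has to be granted or denied up front.

```python
# Made-up priors and payoffs, purely to illustrate the structure of the calculation.

HYPOTHESES = {                 # hypothesis -> prior probability
    "no other AI":        0.90,
    "hostile other AI":   0.07,
    "friendly other AI":  0.03,
}

# Utility of some candidate plan under each hypothesis (again, invented numbers).
PLAN_UTILITY = {
    "no other AI":        10.0,
    "hostile other AI":  -500.0,   # the "could be very dangerous" case
    "friendly other AI":   12.0,
}

expected = sum(HYPOTHESES[h] * PLAN_UTILITY[h] for h in HYPOTHESES)
print(f"E[utility of plan] = {expected:+.2f}")
# = 0.90*10 + 0.07*(-500) + 0.03*12 = -25.64; the dangerous possibility
# enters only through its probability and its effect on utility.
```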
Anthropics are a problem for humans because our utility functions are defined in terms of a “self”, a notion that does not generalize well; for an AI, this would be a problem for us when writing the utility function to give it, but not for the AI doing the optimization.
So if the AI were building a robot that had to bet in a presumptuous philosopher or doomsday scenario, how would it bet in each? You do already have the right answer to sleeping beauty.
There are two utility functions here, and the AI isn’t optimizing the paperclip one.
I used a ridiculously bad example. I was trying to ask what would happen if the AI considered the possibility that tiling the universe with paperclips would actually satisfy its own utility function. That is very implausible, but that just means that if a superintelligence were to decide to do so, it would have a very good reason.
This premise is unnecessary, since that possibility can be folded into the probability model.
No, you would defect on the true prisoner’s dilemma.
So if the AI were building a robot [...] how would it bet in each?
That would depend on what the AI hoped to achieve by building the robot. It seems to me that specifying that clearly should determine what approach the AI would want the robot to take in such situations.
(More generally, it seems to me that a lot of anthropic puzzles go away if one eschews indexicals. Whether that’s any use depends on how well one can do without indexicals. I can’t help suspecting that the Right Answer may be “perfectly well, and things that can only be said with indexicals aren’t really coherent”, which would be … interesting. In case it’s not obvious, everything in this paragraph is quarter-baked at best and may not actually make any sense. I think I recall that someone else posted something in LW-discussion recently with a similar flavour.)
I think I agree with everything written here, so at least your naive decision theory looks like it can handle anthropic problems.

Which side of the presumptuous philosopher’s bet would you take?

Just to clarify, I am not the author of the original article; I haven’t proposed any particular naive decision theory. (I don’t think “treat indexicals as suspect” qualifies as one!)
Unfortunately, my mind is not the product of a clear-thinking AI and my values do (for good or ill) have indexically-defined stuff in them: I care more about myself than about random other people, for instance. That may or may not be coherent, but it’s how I am. And so I don’t know what the “right” way to get indexicals out of the rest of my thinking would be. And so I’m not sure how I “should” bet in the presumptuous philosopher case. (I take it the situation you have in mind is that someone offers to bet me at 10:1 odds that we’re in the Few People scenario rather than the Many People scenario, or something of the kind.) But, for what it’s worth, I think I take the P.P.’s side, while darkly suspecting I’m making a serious mistake :-). I repeat that my thinking on this stuff is all much less than half-baked.
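To make the arithmetic behind that bet explicit (all the numbers below, the population sizes, the stakes, and the 50/50 non-anthropic prior, are invented for illustration and come neither from the original thought experiment nor from anyone in this thread): if the same bet is offered to every observer in whichever world is actual, and they all follow the same policy, then whether accepting looks good turns on whether utility is summed over all of those bettors or counted once per world.

```python
# Toy version of the 10:1 Presumptuous Philosopher bet. Population sizes, stakes
# and the 50/50 non-anthropic prior are all invented for illustration. Assume the
# same bet is offered to every observer in whichever world is actual and that they
# all follow the same policy. Accepting wins 1 if "Many" is actual and loses 10 if
# "Few" is actual (roughly the 10:1 odds described above, taking the P.P.'s side).

WORLDS = {              # world -> (non-anthropic prior, number of observers offered the bet)
    "few":  (0.5, 10**6),
    "many": (0.5, 10**12),
}

def per_observer_payoff(world, accept):
    if not accept:
        return 0.0
    return 1.0 if world == "many" else -10.0

def expected_total(accept):
    """Utility summed over every bettor in the actual world ('universe-level' utility)."""
    return sum(p * n * per_observer_payoff(w, accept) for w, (p, n) in WORLDS.items())

def expected_average(accept):
    """Utility counted once per world, however many observers it contains."""
    return sum(p * per_observer_payoff(w, accept) for w, (p, _) in WORLDS.items())

if __name__ == "__main__":
    print("summed utility:    accept", expected_total(True), "  decline", expected_total(False))
    print("per-world utility: accept", expected_average(True), "  decline", expected_average(False))
    # Summed: 0.5*1e6*(-10) + 0.5*1e12*(+1) ~= +5e11, so accept -- the P.P.'s bet.
    # Per-world: 0.5*(-10) + 0.5*(+1) = -4.5, so decline.
    # Which of these is the "right" utility function is exactly the question of
    # what the robot's builder hoped to achieve.
```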
I realized this, but I think my mind attached some of snarles’ statements to you.
I agree with your choice in the presumptuous philosopher problem, but I doubt that anyone could actually be in such an epistemic state, basically because the Few People cannot be sure that there is not another universe, with completely different laws of physics, simulating many copies of theirs, and because of many other qualitatively similar possible scenarios.