The scroll modifies your expectations. The genie twist-interprets X, and then assesses your expectations of the result of the genie’s interpretation of X. (“Why, that’s just what you’d expect destroying the world to do! What are you complaining about?”) The complete list of expectations regarding X is at least slightly self-contradictory, so of course the genie has no option except to modify your expectations directly...
Oooh, is this now the “Eliezer points out how your wish would go wrong” thread? I wanna play too! :p
“I wish for that which I’d wish for if I had an uninterrupted year of thinking about it and freely talking to a dedicated copy of Eliezer Yudovsky”
Uh oh...
Eliezer Yudkowsky:
Eliezer Yud_ov_sky:
No sleep, or anything that would interrupt thinking about it, for a year, might lead to an interesting wish.
Well, it’s obvious what happens then: the genie lets a dedicated copy of Eliezer out of a box.
No sleep, or anything else that would mean not thinking about it, for a year. That
The genie is, after all, all-powerful, so there are any number of subtle changes it could make that you didn’t specify against that would immediately make you, or someone else, wish for the world to be destroyed. If that’s the genie’s goal, you have no chance. Heck, if it can choose its form it could probably appear as some psycho-linguistic anomaly that hits your retina just right to make you into a person who would wish to end the world.
Really I’m just giving the genie a chance to show me that it’s a nice guy. If it’s super evil I’m doomed regardless, but this wish test (hopefully) distinguishes between a benevolent genie and one that’s going to just be a dick.
Consider three classes of genies:
(A) a genie that’s going to “just be a dick” but is not skilled at it;
(B) a genie that is benevolent;
(C) a genie that’s going to “just be a dick” but is very skilled at it.
Your test will (or at least may) tell A apart from (B or C). It won’t tell B apart from C.
The “there is no safe wish” rule applies to C. Sure, if your genie is not skilled at being “evil” (having a utility function very different from yours), you can craft a wish that is beyond the genie’s ability to twist. But if the genie is skilled and much more intelligent than you are, with, say, the ability to spend the equivalent of a million years thinking about how to twist the wish in one second, he’ll find a flaw and use it.
“I wish for a paper containing the exact wording of a wish that, when spoken to you, would meet all my expectations as of September 3, 2012, for a wish granting X.”
(Then, if my expectations yesterday did contain self-contradictions, the genie will do… whatever it did if I wished that 2 + 2 = 5.)