Is there a safe way to wish for an unsafe genie to behave like a safe genie? That seems like a wish TOSWP should work on.
If you can rigorously define Safety, you’ve already solved the Safety Problem. This isn’t a shortcut.
I wish for you to interpret my wishes how I interpret them.
Can anyone find a problem with that?
If the genie merely alters the present to conform to your wishes, you can easily run into unintended consequences.
The other problem is that divining someone’s intent is tricky business. A person often has a dozen impulses at cross-purposes to one another, and the interpretation of your wish will likely vary depending on how much sleep you got and what you had for lunch. There’s a sci-fi short story, “Oddy and Id”, that examines the curious case of a man with luck so amazing that the universe bends to satisfy him. I won’t spoil it, but I think it brings up a relevant point.
Even when you make requests of other people, they may fulfil them in ways you would prefer they hadn’t. The more powerful the genie is at divining your true intent, the more powerfully it can find ways of fulfilling your wishes that may not be what you want. It is not obvious that there is a favorable limit to this process.
Your answers to questions about your intent may depend on the order the questions are asked. Or they may depend on what knowledge you have, and if you study different things you may come up with different answers. Given a sufficiently powerful genie, there is no real entity that is “how I interpret the wish”.
How is the genie supposed to know your answers to all possible questions of interpretation? Large parts of “your interpretation” may not exist until you are asked about some hypothetical circumstance. Even if you are able to answer every such question, how is the genie to know the answer without asking you? Only by having a model of you sufficiently exact that you are confident it will give the same answers you would, even to questions you have not thought of and would have a hard time answering. But that is wishing for the genie to do all the work of being you.
A lot of transhumanist dreams seem to reduce to this: a Friendly AGI will do for us all the work of being us.
If I ask the genie for long life, and the genie is forced to decide between a 200-year lifespan with a 20% chance of a painful death and a 201-year lifespan with a 21% chance of a painful death, it is possible that the genie might not get my preferences exactly right, or that my preferences between those two outcomes may depend on how I am asked or how I am feeling at the time.
But if the genie messed up and picked the one that didn’t really match my preferences, I would only be slightly displeased. I notice that these go together: in cases where it would be genuinely hard or impossible for the genie to figure out what I prefer, the fact that the genie might not get my preferences correct only bothers me a little. In cases where extrapolating my preferences is much easier, the genie getting them wrong would matter to me a lot more (I would really not like a genie that grants my wish for long life by turning me into a fungal colony). So just because the genie can’t know the answer to every question about my extrapolated preferences doesn’t mean that it can’t know them well enough that I would consider it a good genie to ask for wishes.
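As a minimal sketch of why that close call is underdetermined, here is a toy utility model (the numbers and the pain_penalty weight are made up for illustration; nothing in the wish “long life” actually specifies them). The preferred outcome flips depending on a weight the wish never pins down:

```python
# Toy utility model for the two lifespan outcomes above.
# "pain_penalty" is a made-up weight for how much I dislike a painful death;
# my real preferences never pin this number down.

def utility(years, p_painful_death, pain_penalty):
    # Value of an outcome: years lived, minus an expected penalty for a painful death.
    return years - pain_penalty * p_painful_death

for pain_penalty in (10, 200):
    a = utility(200, 0.20, pain_penalty)  # outcome A: 200 years, 20% painful death
    b = utility(201, 0.21, pain_penalty)  # outcome B: 201 years, 21% painful death
    preferred = "A" if a > b else "B"
    print(f"pain_penalty={pain_penalty}: A={a:.1f}, B={b:.1f} -> prefer {preferred}")

# With pain_penalty=10 the extra year wins (prefer B);
# with pain_penalty=200 the extra risk wins (prefer A).
```

The flip at some arbitrary threshold is the point: for outcomes this close, my own answer could go either way, and getting it wrong would barely bother me.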
What if you never think about the interpretation? Or does the wish mean how you would interpret them? Define “would”, then.
If you think about the interpretation, then you can already explain it. The problem is that you don’t actually think about every aspect and possibility while wishing.
Even if you never think about the interpretation, most aspects of wishes will have an implicit interpretation based on your values. You may never have thought about whether wishing for long life should turn you into a fungal colony, but if you had been asked “does your wish for long life mean you’d want to be turned into a fungal colony?”, you’d have said “no”.
A sufficiently powerful genie might instead make safe genies, by definition, more unsafe. Then your wish could be granted.
Edit (2015), caution: I think this particular comment is harmless in retrospect… but I wouldn’t give it much weight.
I would create the machine (genie) to respond only in ways that cannot physically or mentally hurt or injure any of the participants.
I don’t think you quite understood the article :P
It’s incredibly hard to specify things unambiguously. Even in everyday work practice, communication failures cause tons of problems; you always have to make assumptions, because absolutely precise definition of everything is extremely wasteful (if it’s even possible at all). I cringe whenever someone says “But that’s obvious! You should have thought of that automatically!”. Obviously, their model of reality (wherein I am aware of that particular thingy) is flawed, since I was not.
That’s the largest problem when delegating any work, IMO: we all have different preconceptions, and you can’t expect anyone else to share all those relevant to any given task. At least not for anything more complicated than pure math :D
Let us know when you can encode what “physically or mentally hurt or injure any of the participants” means in an actual existing programming language of your choice. :-)
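To make concrete what that challenge demands, here is a deliberately unfinished sketch in Python (every function name here is hypothetical, invented for illustration). Each clause of the instruction bottoms out in a predicate nobody knows how to write:

```python
# Deliberately unfinished sketch: an attempt to turn
# "respond only in ways that cannot physically or mentally hurt or injure
#  any of the participants" into an executable check.

def is_participant(entity) -> bool:
    # Who counts? Bystanders? Future people? The genie itself?
    raise NotImplementedError("'participant' is not defined")

def causes_physical_harm(action, entity) -> bool:
    # Requires a causal model of the world good enough to trace every consequence.
    raise NotImplementedError("'physically hurt or injure' is not defined")

def causes_mental_harm(action, entity) -> bool:
    # Requires a model of the entity's mind: distress, grief, offence, trauma?
    raise NotImplementedError("'mentally hurt' is not defined")

def response_is_allowed(action, world) -> bool:
    # The wished-for safety filter, reduced to predicates we cannot yet write.
    return not any(
        causes_physical_harm(action, entity) or causes_mental_harm(action, entity)
        for entity in world
        if is_participant(entity)
    )
```

Every NotImplementedError marks a spot where the “obvious” intended meaning would have to be spelled out explicitly, which is exactly the hidden complexity being pointed at.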