Should I take it as an admission that you don’t actually know whether to choose torture over dust specks, and would rather delegate this question to the FAI?
All moral questions should be delegated to FAI, whenever that’s possible, but this is trivially so and doesn’t address the questions.
What I’ll choose will be based on some mix of moral intuition, heuristics about the utilitarian shape of morality, and expected utility estimates. But that would be a matter of making the decision, not a matter of obtaining interesting knowledge about the actual answers to the moral questions.
I don’t know whether torture or specks are preferable. I can offer some arguments that torture is better, and some arguments that specks are better, but that won’t give much hope for eventually figuring out the truth, unlike with the more accessible questions in natural science, like the speed of light. I can say that if given the choice, I’d choose torture, based on what I know, but I’m not sure it’s the right choice and I don’t know of any promising strategy for learning more about which choice is the right one. And thus I’d prefer to leave such questions alone, so long as the corresponding decisions don’t actually need to be made.
I don’t see what these thought experiments can teach me.
As has happened several times before, you seem to take as obvious some things that I don’t find obvious at all, and which would make nice discussion topics for LW.
How can you tell that some program is a fair extrapolation of your morality? If we create a program that gives 100% correct answers to all “realistic” moral questions that you deal with in real life, but gives grossly unintuitive and awful-sounding answers to many “unrealistic” moral questions like Torture vs Dustspecks or the Repugnant Conclusion, would you force yourself to trust it over your intuitions? Would it help if the program were simple? What else?
I admit I’m confused on this issue, but feel that our instinctive judgements about unrealistic situations convey some non-zero information about our morality that needs to be preserved, too. Otherwise the FAI risks putting us all into a novel situation that we will instinctively hate.
This is the main open question of FAI theory. (Although FAI doesn’t just extrapolate your revealed reliable moral intuitions: it should consider at least the whole mind as source data.)
I don’t suppose agreeing on more reliable moral questions is an adequate criterion (sufficient condition), though I’d expect agreement on such questions to more or less hold. FAI needs to be backed by solid theory, explaining why exactly its answers are superior to moral intuition. That theory is what would force one to accept even counter-intuitive conclusions. Of course, one should be careful not to be fooled by a wrong theory, but being fooled by your own moral intuition is also always a possibility.
Maybe they do, but how much would you expect to learn about quasars from observations made by staring at the sky with your eyes?
We need better methods that don’t involve relying exclusively on vanilla moral intuitions. What kinds of methods would work, I don’t know, but I do know that moral intuition is not the answer. FAI refers to successful completion of this program, and so represents answers more reliable than moral intuition.
If by “solid” you mean “internally consistent”, there’s no need to wait—you should adopt expected utilitarianism now and choose torture. If by “solid” you mean “agrees with our intuitions about real life”, we’re back to square one. If by “solid” you mean something else, please explain what exactly. It looks to me like you’re running circles around the is-ought problem without recognizing it.
How could I possibly mean “internally consistent”? Being consistent conveys no information about a concept, aside from its non-triviality, and so can’t be a useful characteristic. And choosing specks is also “internally consistent”. Maybe I like specks in others’ eyes.
FAI theory should be reliably convincing and verifiable, preferably on the level of mathematical proofs. FAI theory describes how to formally define the correct answers to moral questions, but doesn’t necessarily help at all with intuitive understanding of what these answers are. It could be a formalization of “what we’d choose if we were smarter, knew more, had more time to think”, for example, which doesn’t exactly show what the answers look like.
Then the FAI risks putting us all in a situation we hate, which we’d love if only we were a bit smarter.
FAI doesn’t work with “us”; it works with world-states, which include every detail, including whatever distinguishes present humans from hypothetical smarter people. A given situation that includes a smarter person is distinct from an otherwise identical situation that includes a present-day human, and so these situations should be optimized differently.
I see your point, but my question still stands. You seem to take it on faith that an extrapolated smarter version of humanity would be friendly to present-day humanity and wouldn’t want to put it in unpleasant situations, or that they would and it’s “okay”. This is not quite as bad as believing that a paperclipper AI will “discover” morality on its own, but it’s close.
I don’t “take it on faith”, and the example with “if we were smarter” wasn’t supposed to be an actual stab at FAI theory.
On the other hand, if we define “smarter” as also keeping preference fixed (the alternative would be wrong, as a Smiley is also “smarter”, but clearly not what I meant), then smarter versions’ advice is by definition better. This, again, gives no technical guidance on how to get there, hence the word “formalization” was essential in my comment. The “smarter” modifier is about as opaque as the whole of FAI.
You define “smarter” as keeping “preference” fixed, but you also define “preference” as the extrapolation of our moral intuitions as we become “smarter”. It’s circular. You’re right, this stuff is opaque.
It’s a description, a connection between the terms, but not a definition (pretty useless, but not circular).