I think because Eliezer wanted to ensure a good chance that right_Eliezer and right_random_human turn out to be very similar. If you let each person choose how to extrapolate using their own current ideas, you’re almost certainly going to end up with very different extrapolated moralities.
The point is not that they’ll be different, but that mistakes will be made, making the result not quite right, or more likely not right at all. So at the early stage, one must be very careful and develop a reliable theory of how to proceed, instead of just doing stuff at random, or rather according to current human heuristics.
An extended amount of reflection looks like one of the least invasive self-improvement techniques, something that’s expected to make you more reliably right, especially if you’re given opportunity to decide how the process is to be set up. This could get us to the next stage, and so on. More invasive heuristics can prove too disruptive, wrong in unexpected and poorly understood ways, so that one won’t be able to expect the right outcome without close oversight from a moral judgment, which we don’t yet have in any technically strong enough form.
Suppose you have the intuition that extended reflection and coherence are good heuristics to guide your extrapolation. I, on the other hand, think that extended reflection as a base human is dangerous, and coherence has nothing to do with what’s right. I’d rather that the extrapolated me experiment with self-modification after only a moderate amount of theorizing, and at the end merge with its counter-factual versions through acausal negotiation.
Suppose further that you end up in control of FAI design, and you want it to take my morality into account. Would you have it extrapolate me using your preferred method, or mine?
Suppose you have the intuition that extended reflection and coherence are good heuristics to guide your extrapolation. I, on the other hand, think that extended reflection as a base human is dangerous, and coherence has nothing to do with what’s right.
What these heuristics discuss are ways of using more resources. The resources themselves are heuristically assumed to be useful, and so we discuss how to use them best.
(Now to slip to an object-level argument for a change.)
Notice the “especially if you’re given opportunity to decide how the process is to be set up” in my comment. I agree that unnaturally extended reflection is dangerous; we might even run into physiological problems with computations in brains that are too chronologically old. But 50 years is better than 6 months, even if both 50 years and 6 months are dangerous. And you can actually work on planning these reflection sessions: set up groups of humans to work for some time, then maybe reset them and have them pass only their writings to new humans, filtering such findings using not-older-than-50 humans trained on more and more improved findings, and so on. For most points you could raise as a reason it’s dangerous, we could work on finding a solution for that problem. For any experiment with FAI design, we would be better off thinking about it first.
Likewise, suppose you task 1000 groups of humans with coming up with possible strategies for using the next batch of computational resources (not for doing the most good explicitly, but for developing an even better heuristic understanding of the problem), and you model human research groups as having a risk of falling into reflective death spirals, where all members of a group succumb to a memetic infection that gives no answers to the question they considered. Then it seems like a good heuristic to place considerably less weight on suggestions that come up very rarely and don’t get supported by some additional vetting process.
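To make the weighting heuristic concrete, here is a minimal sketch in Python. Everything specific in it is an assumption made up for illustration: the function name `weight_suggestions`, the `min_share` threshold for what counts as "very rare", and the `rare_discount` factor are hypothetical and not part of the proposal itself.

```python
from collections import Counter


def weight_suggestions(group_suggestions, vetted=frozenset(),
                       min_share=0.01, rare_discount=0.1):
    """Weight each suggestion by the share of groups that proposed it.

    Suggestions proposed by fewer than `min_share` of the groups and not
    backed by extra vetting are discounted by `rare_discount`.
    All thresholds are illustrative assumptions.
    """
    n_groups = len(group_suggestions)
    # Count each suggestion at most once per group.
    counts = Counter(s for suggestions in group_suggestions
                     for s in set(suggestions))
    weights = {}
    for suggestion, count in counts.items():
        share = count / n_groups
        if share < min_share and suggestion not in vetted:
            # Rare and unvetted: could be one group's death spiral.
            share *= rare_discount
        weights[suggestion] = share
    return weights


# Hypothetical toy data: four groups instead of 1000.
groups = [["reflect"], ["reflect"], ["reflect", "odd"], ["vetted_rare"]]
weights = weight_suggestions(groups, vetted={"vetted_rare"}, min_share=0.5)
```

In this toy run, the rarely proposed and unvetted suggestion ends up with a much lower weight than either the common one or the rare one that passed vetting. That is all the heuristic asks for; the particular numbers carry no significance.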
For example, the first batches of research could focus on developing effective training programs in rationality, then in social engineering, voting schemes, and so on. The overall architecture of the future human-level meta-ethics necessary for more dramatic self-improvement (or for improvement in the methods of getting things done, such as using a non-human AI or a science of deep non-human moral calculations) would come much later.
In short, I’m not talking about anything that opposes the strategies you named, so you’d need to point to incurable problems that make the strategy of thinking more about the problem lead to worse results than randomly making stuff up (sorry!).
and at the end merge with its counter-factual versions through acausal negotiation.
The current understanding of acausal control (which is a consideration from decision theory, which can in turn be seen as a normative element of a meta-ethics, which is the same kind of consideration as “let’s reflect more”) is inadequate to place the weight of the future on a statement like this. We need to think more about decision theory, in particular, before making such decisions.
Suppose further that you end up in control of FAI design
What does that mean? If I can order a computer around, that doesn’t tell me what to do with it.
and you want it to take my morality into account. Would you have it extrapolate me using your preferred method, or mine?
I’d think about the problem more, or try implementing a reliable process for that if I can.