Groan! Of all the Omega crap, this is the craziest. Can anyone explain to me why anyone should ever contemplate this impossible scenario? Don’t just vote down.
If you do not test a principle in wacky hypothetical situations that will never happen, then you run the risk of going by pure intuition by another name. Many people are not comfortable with that.
But they will never happen! That’s like… like
void f(unsigned int i) { if (i < 0) throw "Invalid argument."; }
!
What principles are being tested here?
Well, that can test whether your compiler / language actually does anything when you declare i an unsigned int. Yes, there are some that will happily accept ‘unsigned’ and throw it away.
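For concreteness, a minimal sketch of that test, assuming a GCC- or Clang-style toolchain (the flag names mentioned in the comments are theirs, not anything from this thread):

#include <limits>
#include <stdexcept>

void f(unsigned int i) {
    // If the compiler honors 'unsigned', this holds at compile time.
    static_assert(!std::numeric_limits<unsigned int>::is_signed,
                  "unsigned int really is unsigned here");
    // For a genuine unsigned type this comparison is always false; GCC and
    // Clang will typically say so under -Wtype-limits or -Wtautological-compare,
    // which is the "does the compiler actually do anything" check.
    if (i < 0)
        throw std::invalid_argument("Invalid argument.");
}

Whether the dead branch even survives into the object file is then up to the optimizer.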
Perhaps I could explain in a more helpful manner if I could understand your oddly-punctuated remark there.
An unsigned integer can’t have a minus sign, so it can’t be less than 0. Programmer talk.
I’m comparing contemplating impossible scenarios to computer code that will never be executed because its execution depends on a condition that will never be true. Such code does nothing, but it still costs time to write and storage space.
Okay...
Say you want to test principle X (a principle of ethics or rationality or whatever you like) and see if it gets a good answer in every case. You have some choices: you can try to test every case; you can use the principle for a couple of weeks and see if it encourages you to leap off of anything tall or open fire on a daycare; you can come up with a couple dozen likely situations that might call for a principle like the one you have in mind and see if it does all right; or you can do your absolute best to destroy that principle and find a situation somewhere, somehow, where it flounders and dies and tells you that what you really should do is wear a colander on your head and twirl down the street singing Gilbert and Sullivan transposed into the Mixolydian.
Weird situations like those in which Omega is invoked are attempts at the last, which is usually the strategy quickest to turn up a problem with a given principle (even if the counterexample is actually trivial). The “attempt to destroy” method is effective because it causes you to concentrate on the weak points of the principle itself, instead of being distracted by other confounding factors and conveniences.
I get what you’re saying.
What principle is being tested here right now?
The various Newcomb situations have fairly direct analogues in everyday things like ultimatum situations or promise keeping. They strip those down to reduce the number of variables: the “certainty of trusting the other party” dial gets turned up to 100% for Omega, the “expectation of repeat” dial down to 0, and so on, in order to evaluate how to think about such problems once certain factors are cut out.
That said, I’m not actually sure what this question has to do with Newcomb’s paradox / counterfactual mugging, or what exactly is interesting about it. If it’s just asking “what information do you use to calculate the probability you plug into the EU calculation?” and Newcomb’s paradox is just being used as one particular example, I’d say the obvious answer is “whatever probability you believe now.” After all, that’s already going to be informed by your past estimates and any information you have available (such as that community of rationalists and their estimates). If the question is something specific to Newcomb’s paradox, I’m not getting it.
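To make the “dials” concrete, here is a toy expected-utility comparison using the standard Newcomb payoffs ($1,000 in the transparent box, $1,000,000 in the opaque one); the figures and names are illustrative assumptions, not anything given in this thread. The probability p is the “whatever probability you believe now” that gets plugged into the EU calculation, and turning it up to 1 is the certainty dial at 100%:

#include <cstdio>
#include <initializer_list>

int main() {
    const double small = 1000.0;     // transparent box
    const double big = 1000000.0;    // opaque box, filled only if one-boxing was predicted

    // p is your current credence that the predictor is right (the certainty dial).
    for (double p : {0.5, 0.9, 0.99, 1.0}) {
        double eu_one_box = p * big;                              // right: big prize; wrong: nothing
        double eu_two_box = p * small + (1 - p) * (big + small);  // right: only the small box pays
        std::printf("p = %.2f   one-box EU = %9.0f   two-box EU = %9.0f\n",
                    p, eu_one_box, eu_two_box);
    }
    return 0;
}

Under these assumed payoffs the crossover sits at p ≈ 0.5005, which is part of why cranking the certainty dial all the way up changes the character of the problem.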
I think this one is actually doing it backwards—“here are some wacky situations, someone come up with a principle that works here”.
“This is the wacky situation that breaks the best current candidate method. Can you fix it?”
True, but note as a caveat the problems many ethicists have in recent years brought up involving thought experiments.
For example, if our concepts are fuzzy, we should expect our rules about the concepts to output fuzzy answers. Testing boundary cases might in that case not be helpful, as the distinctions between concepts might fall apart.
Of course. Like most things, it’s not unanimously agreed upon.
A subproblem of Friendly AI, or at least a similar problem, is the challenge of proving that properties of an algorithm are stable under self-modification. If we don’t identify a provably optimal algorithm for maximizing expected utility in decision-dependent counterfactuals, it’s hard to predict how the AI will decide to modify its decision procedure, and it’s harder to prove invariants about it.
Also, if someone else builds a rival AI, you don’t want it to be able to trick your AI into deciding to self-destruct by setting up a clever Omega-like situation.
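A toy illustration of the invariant problem (entirely my own sketch; Agent, Decision, and the test battery are hypothetical names, not any actual FAI proposal): an agent that only adopts a new decision procedure after checking a property on a finite set of test scenarios. The check is easy to write; the hard part pointed at above is extending it to a proof that covers situations nobody put in the battery, Omega-style setups included.

#include <functional>
#include <vector>

// A decision procedure maps a scenario (reduced here to a single
// expected-payoff number) to an accept/refuse action.
using Decision = std::function<bool(double)>;

struct Agent {
    Decision decide;

    // Adopt `candidate` only if it preserves a toy invariant on every test
    // scenario: never refuse an action with positive expected payoff.
    bool try_self_modify(const Decision& candidate,
                         const std::vector<double>& test_scenarios) {
        for (double payoff : test_scenarios) {
            if (payoff > 0 && !candidate(payoff))
                return false;   // invariant violated: keep the old procedure
        }
        decide = candidate;     // passed on the battery, so swap it in
        return true;
    }
};

An adversary who knows the battery only has to construct a scenario outside it, which is the “clever Omega-like situation” worry.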
If we can predict how an AI would modify itself, why don’t we just write the already-modified AI?
Because the point of a self-modifying AI is that it will be able to self-modify in situations we don’t anticipate. Being able to predict its self-modification in principle is useful precisely because we can’t hard-code every special case.