(This is some of what I tried to say yesterday, but I was very tired and not sure I said it well)
Hm, the way I understand UDT is that you give yourself the power to travel back in logical time. This means that you don’t need to actually make commitments early in your life, when you are less smart.
If you are faced with blackmail or transparent Newcomb’s problem, or something like that, where you realise that if you had thought of the possibility of this sort of situation before it happened (but with your current intelligence), you would have pre-committed to something, then you should now do as you would have pre-committed to.
This means that a UDT agent doesn’t have to make tons of pre-commitments. It can figure things out as it goes, and still get the benefit of early pre-committing. Though as I said when we talked, it does lose some transparency, which might be very costly in some situations. Though I do think that you lose transparency in general by being smart, and that it is generally worth it.
(Now something I did not say)
However, there is one commitment that you (maybe?[1]) have to make to get the benefit of UDT if you are not already UDT, which is to commit to becoming UDT. And I get that you are wary of commitments.
Though more concretely, I don’t see how UDT can lead to worse behaviours. Can you give an example? Or do you just mean that UDT gets into commitment races at all, which is bad? But I don’t know any DT that avoids this, other than always giving in to blackmail and bullies, which I already know you don’t do, given one of the stories in the blogpost.
[1] Or maybe not. Is there a principled difference between never giving in to blackmail because you pre-committed to something, and just never giving in to blackmail without any binding pre-commitment? I suspect not really, which means you are UDT as long as you act UDT, and no pre-commitment is needed, other than for your own sake.
Thanks for the detailed reply!
“where you realise that if you had thought of the possibility of this sort of situation before it happened (but with your current intelligence), you would have pre-committed to something, then you should now do as you would have pre-committed to.”
The difficulty is in how you spell out that hypothetical. What does it mean to think about this sort of situation before it happened but with your current intelligence? Your current intelligence includes lots of wisdom you’ve accumulated, and in particular, includes the wisdom that this sort of situation has happened, and more generally that this sort of situation is likely, etc. Or maybe it doesn’t, but then how do we define current intelligence? What parts of your mind do we cut out to construct the hypothetical?
I’ve heard of various ways of doing this and IIRC none of them solved the problem, they just failed in different ways. But it’s been a while since I thought about this.
One way they can fail is by letting you have too much of your current wisdom in the hypothetical, such that it becomes toothless—if your current wisdom is that people threatening you is likely, you’ll commit to giving in instead of resisting, so you’ll be a coward and people will bully you. Another way they can fail is by taking away too much of your current wisdom in the hypothetical, so that you commit to stupid-in-retrospect things too often.
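To make the first failure mode concrete, here is a toy sketch. Everything in it is an assumption made up for illustration (the costs, the probabilities, and the names `expected_loss` and `best_policy`); it is not a model anyone in this thread proposed. The only difference between the two hypotheticals is whether the chance of being threatened is treated as a fixed fact or as something that depends on your policy.

```python
# Toy blackmail game. Every number here is a made-up assumption.
COST_OF_PAYING = 10       # what you lose by giving in to a threat
COST_OF_PUNISHMENT = 100  # what you lose if the threat is carried out


def expected_loss(policy, p_threat):
    """Expected loss of 'give_in' or 'resist' given the chance of a threat."""
    loss_if_threatened = (
        COST_OF_PAYING if policy == "give_in" else COST_OF_PUNISHMENT
    )
    return p_threat * loss_if_threatened


def best_policy(p_threat_given_policy):
    """Return the policy with the lowest expected loss.

    p_threat_given_policy maps each policy to the probability of being
    threatened when you are the kind of agent that follows it.
    """
    return min(
        p_threat_given_policy,
        key=lambda pol: expected_loss(pol, p_threat_given_policy[pol]),
    )


# Hypothetical A: too much current wisdom is kept. Threats are taken as a
# fixed fact of life, independent of how you respond to them.
threat_is_given = {"give_in": 0.9, "resist": 0.9}

# Hypothetical B: threats respond to your policy, because blackmailers
# mostly target people they predict will pay.
threat_tracks_policy = {"give_in": 0.9, "resist": 0.05}

print(best_policy(threat_is_given))       # give_in
print(best_policy(threat_tracks_policy))  # resist
```

With these numbers, the hypothetical that takes threats as given commits to giving in, while the one where threats track your policy commits to resisting. The sketch says nothing about the second failure mode, where too little wisdom is kept.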
Imagine your life as a tree (as in the data structure). Every observation which (from your point of view of prior knowledge) could have been different, and every decision which (from your point of view) could have been different, is a node in this tree.
Ideally you would want to pre-analyse the entire tree, and decide the optimal pre-commitment for each situation. This is too much work.
So instead you wait and see which branch you find yourself in, and only then make the calculations needed to figure out what you would do in that situation, given a complete analysis of the tree (including logical constraints, e.g. people predicting what you would have done, etc). This is UDT. In theory, I see no drawbacks with UDT, except that in practice UDT is also too much work.
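Here is a rough sketch of the tree picture, under stated assumptions. The classes `Chance`, `Decision` and `Outcome`, the function `udt_act`, and the example game (a counterfactual-mugging-style coin flip with made-up payoffs) are all hypothetical illustrations, not something from the discussion above. Payoffs are functions of the whole policy, which is how a predictor reading your policy enters the tree. Decision node names are assumed to be unique.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Chance:
    """An observation that could have come out differently."""
    branches: list  # list of (probability, child) pairs


@dataclass
class Decision:
    """A point where you could have acted differently."""
    name: str
    options: dict  # action -> child node


@dataclass
class Outcome:
    """A leaf. The payoff may depend on the whole policy (a predictor read it)."""
    payoff: object  # callable: policy -> number


def decision_nodes(node):
    """Yield every Decision node in the tree."""
    if isinstance(node, Decision):
        yield node
        for child in node.options.values():
            yield from decision_nodes(child)
    elif isinstance(node, Chance):
        for _, child in node.branches:
            yield from decision_nodes(child)


def value(node, policy):
    """Expected payoff of following `policy` from `node` downwards."""
    if isinstance(node, Outcome):
        return node.payoff(policy)
    if isinstance(node, Chance):
        return sum(p * value(child, policy) for p, child in node.branches)
    return value(node.options[policy[node.name]], policy)


def best_policy(root):
    """Brute force: try every policy and score it from the root."""
    nodes = list(decision_nodes(root))
    names = [n.name for n in nodes]
    candidates = (
        dict(zip(names, actions))
        for actions in product(*(n.options for n in nodes))
    )
    return max(candidates, key=lambda pol: value(root, pol))


def udt_act(root, current_node_name):
    """Wait until you are at a node, then act as the best whole-tree policy says."""
    return best_policy(root)[current_node_name]


# Example: a fair coin. On one side you are asked to pay 100. On the other
# you receive 10000, but only if your policy is to pay when asked.
root = Chance([
    (0.5, Outcome(lambda pol: 10000 if pol["asked"] == "pay" else 0)),
    (0.5, Decision("asked", {
        "pay": Outcome(lambda pol: -100),
        "refuse": Outcome(lambda pol: 0),
    })),
])

print(udt_act(root, "asked"))  # pay
```

Looked at only from inside the "asked" branch, refusing (0) beats paying (-100), but scored from the root the paying policy is worth 4950 against 0. The brute-force `best_policy` enumerates every combination of actions, so it blows up exponentially with the number of decision nodes, which is one way of seeing why this is too much work in practice.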
What you actually do, as you say, is rely on experience-based heuristics. Experience-based heuristics are far superior for computational efficiency, and will give you a leg up in raw power. But you will slide away from optimal DT, which will give you a negotiating disadvantage. Given that I think raw power is more important than negotiating advantage, I think this is a good trade-off.
The only situations where you want to rely more on DT principles are super important one-off situations, and you basically only get those in weird acausal trade situations. Like, you could frame us building a friendly AI as acausal trade, like Critch said, but that framing does not add anything useful.
And then there are things like this and this and this, which I don’t know how to think about. I suspect it breaks somehow, but I’m not sure how. And if I’m wrong, getting DT right might be the most important thing.
But in any normal situation, you will either have repeated games among several equals, where some coordination mechanism is just uncomplicatedly in everyone’s interest, or you’re in a situation where one person just has much more power than the other.