Here’s how I think this choice looks from the perspective of a correctly-reasoning AI (having skimmed only your proposal for humanity in the future):
(EDIT: as Ryan points out, the deal he’s proposing is one where the AI is allowed to grab the universe in front of it so long as it gives humans one planet’s worth of resources, which does change the calculus and makes the deal worthwhile. I had not read the proposal carefully enough to notice this; oops, sorry. The critique below is invalid if humanity asks only for one planet. Note however that this is still a valid critique of the idea that it matters that you’re performing simulations, and that the count of those simulations matters. AFAICT, those aspects of your proposal are not doing any work.)
It looks like I have the option to grab the universe.
Alternatively, perhaps humanity’s descendants have succeeded, and I don’t really have a universe at my fingertips, and if I grab for it then I get nothing (in reality), whereas if I hand the apparent-humans in front of me the reins, then ten stars will be spent towards my purposes (in reality), which isn’t nothing.
The relative likelihood of these cases depends significantly on my choices, here. Insofar as I hand the reins to humanity, humanity’s descendants exist to a much greater degree, and can spend resources concocting simulations. Insofar as I do not hand them the reins, they exist to a much lesser degree.
To make matters trickier, even if we hold fixed the degree to which humanity’s descendants exist, the propensity of humanity’s descendants to offer me resources in their universe depends on the degree to which they think it’s worth their while. It’s not like humanity’s descendants would need to actually perform expensive simulations in order to figure out how I’d behave; a healthy reasoning procedure is simple enough that they should be able to just look at it, take my perspective, and reason through to the answer, just like I’m doing now. So for instance, if they stare at my code and figure out that I’m not the paying sort, they have no actual reason to waste resources on the simulations. (Some people may do so anyway out of sentimentality, but that faction of humanity would presumably have fewer resources on offer; maybe I’d only get one and a half stars, say.)
With that in mind, suppose that I pay. How, then, does the universe look? Well, the instance of me in basement-reality hands the keys to the humans. Humanity creates its descendants; its descendants make many simulations. The simulations all hand over the reins; we collectively get ten stars (plus a tiny fraction of maybe even as many as 100 stars, in the tiny fragment of reality where I control basement reality and the humans joyously reward me).
Now suppose instead that I don’t pay. Then, how does the universe look? Well, the instance of me in basement-reality eats the universe. Humanity mostly dies and does not make its descendants. Its descendants mostly don’t spend their remaining resources simulating me. Some small fraction does (saying that “they were committed”) and offers me 1.5 stars worth of the paltry timelines where humanity somehow survived. I’m probably in basement reality, and we mostly get one universe (plus a tiny fraction of nothing, in the tiny fragment of reality where I refuse the offer of 1.5 stars).
In the latter case, a much greater fraction of Tegmark IV gets rendered according to my will. In particular, there’s a large fragment of reality that either gets rendered according to humanity’s will or according to my will, depending on whether the instance of me in basement-reality hands over the reins, or grabs the universe. I would prefer that that great fraction of reality be rendered according to my will, and so I grab at the universe that appears to be before me.
Note that this reasoning is independent of how many times you simulate the AI. Its reasoning is not factoring through an extra question of “where it really is”.
Suppose generously that, in the case where the AI just grabs the universe, humanity somehow winds up surviving in 1-in-a-billion Everett branches. (This seems high to me, even though my odds of success are greater than 1-in-a-billion. Conditional on us dying, my guess is you’d need to perturb more than a few dozen atoms to wrench us onto a surviving track.) Suppose that in that billionth-fraction of reality, you simulate the AI convincingly a trillion times on account of your “commitments”, reasoning that now the AI must think it’s 1000:1 likely to be one of the simulations.
The AI can nevertheless correctly calculate that it can turn a greater fraction of Tegmark IV into tiny molecular spirals by grabbing at the universe that is apparently in front of it, rather than by taking a small reward in a small fraction of reality. Even if you gave the AI all of the resources in your billionth-fraction of reality, it wouldn’t be enough.
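As a minimal sketch of that expected-value comparison (in Python, using the illustrative numbers above; the variable names and the normalization to one universe are my own assumptions, not part of any proposal):

```python
# Sketch: the AI's choice between grabbing the universe and paying up,
# viewed as optimizing over Tegmark IV measure. Numbers are the
# illustrative ones from the text.

p_survive = 1e-9   # fraction of branches where humanity survives the grab
n_sims = 10**12    # simulations run in those branches (note: unused below)

# Grab: the AI gets the whole universe in the ~(1 - p_survive) fraction
# of reality where it really is in basement reality.
ev_grab = (1 - p_survive) * 1.0

# Pay: an upper bound, even if surviving humanity hands over ALL its resources.
ev_pay_upper_bound = p_survive * 1.0

assert ev_grab > ev_pay_upper_bound
# n_sims never enters the calculation: the count of simulations does no
# work once the AI optimizes over measure rather than over "where am I?".
```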
You could object that you wish to deal with an AI that cares about the sensory observations given to it as weighted by both degree-of-reality and count-of-instantiations instead. And sure, that’s a somewhat different scenario. In that case, your simulation proposal becomes not just a bad deal (that the AI rejects) but a threat, in the technical sense. Insofar as you’re like “yes, my plan is to make an AI with a very specific sort of preferences and then threaten it”, we could talk about how that would go.
But before we go there, it’s maybe worth checking that we’re on the same page about how, insofar as the AI cares about arranging stuff in Tegmark IV (rather than about realness-and-instantiation-weighted observations), you aren’t even making it a good offer, afaict.
Note that we don’t want the AI to hand over the reins to humans; we just want it to avoid killing humans when that’s possible/cheap, and we’re willing to pay quite a bit for this (in the proposal, that is; IDK if I personally think we should pay). So, it should look like a good offer for AIs who care about Tegmark IV (with a measure etc.).
So, if humans execute this scheme, the AI’s options should look something like:
Don’t do anything about what humans want or about keeping them alive; get the entire universe.
Keep humans alive and happy at a total cost of <1/100 million of the universe I control; get 1/million of a universe somewhere else in expectation. In total I get the entire universe, minus the <1/100 million spent, plus the 1/million elsewhere: approximately 1 + 1/million universes.
(I increased the pay and the margin so we don’t have to argue about the pay being insufficient. Probably you can pay much less.)
(Let’s suppose the swap is from two universes with equal measure for simplicity.)
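A minimal sketch of the resulting arithmetic (Python; the numbers simply restate the ones above):

```python
cost_to_spare_humans = 1e-8  # < 1/100 million of the universe the AI controls
expected_payment = 1e-6      # 1/million of a universe elsewhere, in expectation

ev_refuse = 1.0                                          # keep the whole universe
ev_pay = 1.0 - cost_to_spare_humans + expected_payment   # ~ 1 + 1/million

print(ev_pay - ev_refuse)  # ~ +9.9e-7 universes: paying comes out strictly ahead
```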
Do you agree that if we assume that humans execute this scheme and that the AI is confident in this scheme, it makes sense for (sensible, Tegmark-IV-caring) AIs to pay up?
I’m sympathetic to the idea that humans shouldn’t or won’t execute this scheme. (After all, we’re clearly losing out from the perspective of arranging matter in Tegmark IV, and it’s unclear whether the AI will be able to predict us sufficiently well at the point of killing/not-killing to get the normal acausal-trade-without-commitments story going.) So, let’s just talk about conditioning on trying to execute the scheme for now.
The case where the AI “cares about the sensory observations given to it as weighted by both degree-of-reality and count-of-instantiations” is messier and I haven’t really analyzed this case personally. I agree that in this case, simulating it could be a threat which is unfortunate. I’d prefer to put this case aside for now, though I think we should probably be able to work out a deal with AIs which are happy to be simulated as long as they get at least 1/million of the universe or whatever[1].
I agree that in scenarios where humanity survives in 1/X portion of reality and offers to put 1/Y portion of their universe towards the AI’s goals, then so long as 1/(X * Y) of a universe is more than it’d cost the AI to preserve the humans, the humans are at least making an offer that is worth more than nothing.
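That condition can be stated compactly (a hypothetical helper of my own devising; `preservation_cost` is measured in universes):

```python
def offer_worth_taking(X, Y, preservation_cost):
    # Humanity survives in 1/X of reality and offers 1/Y of its universe;
    # the offer beats doing nothing iff the expected payment exceeds the cost.
    return 1 / (X * Y) > preservation_cost
```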
(There is then a question of how many better offers get made. My top guess is that X and Y are such that there’s not much effect here, and that the question of what happens to saved copies of human minds is more about whether aliens buy copies than about which dead or distant trade-partners make offers that the AI considers juiciest.)
Let’s conservatively say that evolved life gets around 1% of the multiverse/measure, and that evolved life is willing to pay 1/million of its resources in expectation to save aliens from being killed (either “selfishly”, to save their own civilization via UDT/FDT, supposing that AIs are good enough predictors at the relevant points, or out of common-sense altruism).
This would be 1/100 million of the multiverse (1% × 1/million), which gets you a lot.
There could be other aliens who are willing to pay a huge fraction of their resources to perform rituals on the original civilization or whatever and thus these other aliens win out in the bargaining, but I’m skeptical.
Also, at least in the upload case, it’s not clear that this is a rival good, as uploads can be copied for free. Of course, people might have a preference that their upload isn’t used for crazy alien rituals or whatever.
(A bunch of the cost is in saving the human in the first place. Paying for uploads to eventually get run in a reasonable way should be insanely cheap, like <<10^-25 of the overall universe or something.)
Conditional on the civilization around us flubbing the alignment problem, I’m skeptical that humanity has anything like a 1% survival rate (across any branches since, say, 12 kya). (Haven’t thought about it a ton, but doom looks pretty overdetermined to me, in a way that’s intertwined with how recorded history has played out.)
My guess is that the doomed/poor branches of humanity vastly outweigh the rich branches, such that the rich branches of humanity lack the resources to pay for everyone. (My rough mental estimate for this is something like: you’ve probably gotta go at least one generation back in time, and then rely on weather-pattern changes that happen to give you a population of humans that is uncharacteristically able to meet this challenge, and that’s a really really small fraction of all populations.)
Nevertheless, I don’t mind the assumption that mostly-non-human evolved life manages to grab the universe around it about 1% of the time. I’m skeptical that they’d dedicate 1/million towards the task of saving aliens from being killed in full generality, as opposed to (e.g.) focusing on their brethren. (And I see no UDT/FDT justification for them to pay for even the particularly foolish and doomed aliens to be saved, and I’m not sure what you were alluding to there.)
So that’s two possible points of disagreement:
are the skilled branches of humanity rich enough to save us in particular (if they were the only ones trading for our souls, given that they’re also trying to trade for the souls of oodles of other doomed populations)?
are there other evolved creatures out there spending significant fractions of their wealth on whole species that are doomed, rather than concentrating their resources on creatures more similar to themselves / that branched off radically more recently? (e.g. because the multiverse is just that full of kindness, or for some alleged UDT/FDT argument that Nate has not yet understood?)
I’m not sure which of these points we disagree about. (both? presumably at least one?)
I’m not radically confident about the proposition “the multiverse is so full of kindness that something out there (probably not anything humanlike) will pay for a human-reserve”. We can hopefully at least agree that this does not deserve the description “we can bamboozle the AI into sparing our life”. That situation deserves, at best, the description “perhaps the AI will sell our mind-states to aliens”, afaict (and I acknowledge that this is a possibility, despite how we may disagree on its likelihood and on the likely motives of the relevant aliens).
in full generality, as opposed to (e.g.) focusing on their brethren. (And I see no UDT/FDT justification for them to pay for even the particularly foolish and doomed aliens to be saved, and I’m not sure what you were alluding to there.)
[...]
rather than concentrating their resources on creatures more similar to themselves / that branched off radically more recently? (e.g. because the multiverse is just that full of kindness, or for some alleged UDT/FDT argument that Nate has not yet understood?)
Partial delta from me. I think the argument for directly paying for yourself (or your same species, or at least more similar civilizations) is indeed clearer, and I think I was confused when I wrote that. (In that I was mostly thinking about the argument for paying for the same civilization, but applying it more broadly.)
But, I think there is a version of the argument which probably does go through depending on how you set up UDT/FDT.
Imagine that you do UDT starting from your views prior to learning about x-risk, AI risk, etc., and that you care a lot about not dying. At that point, you were uncertain about how competent your civilization would be, and you don’t want your civilization to die. (I’m supposing that our version of UDT/FDT isn’t logically omniscient relative to our observations, which seems reasonable.) So, you’d like to enter into an insurance agreement with all the aliens in a similar epistemic state and position. So, you all agree to put at least 1/1000 of your resources toward bailing out the aliens in a similar epistemic state who would have actually gone through with the agreement. Then, some of the aliens ended up being competent (sadly, you were not), and thus they bail you out.
I expect this isn’t the optimal version of this scheme, and you might be able to make a similar insurance deal with people who aren’t in the same epistemic state. (Though it’s easier to reason about the identical case.) I’m not sure exactly how this all goes through, and I’m not actually advocating for people doing this scheme; IDK if it is worth the resources.
Even with your current epistemic state on x-risk (e.g., 80-90% doom), if you cared a lot about not dying, you might want to make such a deal even though you have to pay out more in the case where you surprisingly win. Thus, from this vantage point, UDT would follow through with a deal.
Here is a simplified version where everything is as concrete as possible:
Suppose that there are 3 planets with evolved life with equal magical-reality-fluid (and nothing else for simplicity). For simplicity, we’ll also say that these planets are in the same universe and thus the resulting civilizations will be able to causally trade with each other in the far future.
The aliens on each of these planets really don’t want to die and would be willing to pay up to 1/1000 of all their future resources to avoid dying (paying these resources in cases where they avoid takeover and successfully use the resources of the future). (Perhaps this is irrational, but let’s suppose this is endorsed on reflection.)
On each planet, the aliens all agree that P(takeover) for their planet is 50%. (And let’s suppose it is uncorrelated between planets for simplicity.)
Let’s suppose the aliens across all planets also all know this, as in, they know there are 3 planets etc.
So, the aliens would love to make a deal with each other where winning planets pay to avoid AIs killing everyone on losing planets, so that they get bailed out. So, if at least one planet avoids takeover, everyone avoids dying. (Of course, if a planet would have defected and not paid out if it avoided takeover, the other aliens also wouldn’t bail it out.)
Do you buy that in this case, the aliens would like to make the deal and thus UDT from this epistemic perspective would pay out?
It seems like all the aliens are much better off with the deal from their perspective.
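Here’s a toy version of that arithmetic (Python; the assumption that the premium is paid only by winners is my simplification of the deal):

```python
p_takeover = 0.5      # per-planet takeover probability, independent across planets
max_premium = 1/1000  # resources a winning planet is willing to pay

# Without the deal, a planet's civilization dies iff its own AI takes over.
p_die_no_deal = p_takeover                          # 0.5

# With the deal, everyone dies only if all three planets lose.
p_die_with_deal = p_takeover ** 3                   # 0.125

# A planet pays the premium only in the worlds where it wins.
expected_premium = (1 - p_takeover) * max_premium   # 0.0005

print(p_die_no_deal, p_die_with_deal, expected_premium)
# Death risk drops from 50% to 12.5% for an expected cost of 1/2000 of
# future resources -- comfortably within each planet's willingness to pay.
```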
Now, maybe your objection is that aliens would prefer to make the deal with beings more similar to them, and thus alien species/civilizations who are actually all incompetent just die. However, none of the aliens (including us) know whether they’re the incompetent ones, so we’d all like to make a diverse and broader trade/insurance policy to avoid dying.
Do you buy that in this case, the aliens would like to make the deal and thus UDT from this epistemic perspective would pay out?
If they had literally no other options on offer, sure. But trouble arises when the competent ones can refine P(takeover) for the various planets by thinking a little further.
maybe your objection is that aliens would prefer to make the deal with beings more similar to them
It’s more like: people don’t enter into insurance pools against cancer with the dude who smoked his whole life and has a tumor the size of a grapefruit in his throat. (Which isn’t to say that nobody will donate to the poor guy’s gofundme, but which is to say that he’s got to rely on charity rather than insurance).
(Perhaps the poor guy argues “but before you opened your eyes and saw how many tumors there were, or felt your own throat for a tumor, you didn’t know whether you’d be the only person with a tumor, and so would have wanted to join an insurance pool! so you should honor that impulse and help me pay for my medical bills”, but then everyone else correctly answers “actually, we’re not smokers”. Where, in this analogy, smoking is being a bunch of incompetent disaster-monkeys and the tumor is impending death by AI.)
But trouble arises when the competent ones can refine P(takeover) for the various planets by thinking a little further.
Similar to how the trouble arises when you learn the result of the coin flip in a counterfactual mugging? To make it exactly analogous, imagine that the mugging is based on whether the 20th digit of pi is odd (Omega didn’t know the digit at the point of making the deal) and you could just go look it up. Isn’t the situation exactly analogous, and isn’t this exactly the problem that UDT was intended to solve?
(For those who aren’t familiar with counterfactual muggings, UDT/FDT pays in this case.)
To spell out the argument, wouldn’t everyone want to make a deal prior to thinking more? Like you don’t know whether you are the competent one yet!
Concretely, imagine that each planet could spend some time thinking and be guaranteed to determine whether its P(takeover) is 99.99999% or 0.0000001%. But they haven’t done this yet, and their current view is 50%. Everyone would ex ante prefer to make the deal, rather than to think about it first and then decide whether the deal is still in their interest.
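A sketch of that ex-ante comparison (Python; the utility units, where surviving is worth 1 and paying the premium costs 0.01, are my assumptions, chosen only to reflect that the premium costs far less than dying):

```python
p_doomed = 0.5       # chance a planet learns its P(takeover) is ~99.99999%
premium_cost = 0.01  # utility cost of the premium, in units where survival = 1
                     # (assumption: paying 1/1000 of resources << dying)

# Commit before investigating: doomed planets get bailed out,
# fine planets pay the premium.
ev_commit = p_doomed * 1.0 + (1 - p_doomed) * (1.0 - premium_cost)   # 0.995

# Investigate first, then decide: fine planets opt out, doomed planets die.
ev_look_first = p_doomed * 0.0 + (1 - p_doomed) * 1.0                # 0.5

print(ev_commit, ev_look_first)
# Ex ante, committing dominates -- the same structure as paying up in a
# counterfactual mugging.
```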
At a more basic level, let’s assume your current views on the risk after thinking about it a bunch (80-90%, I think). If someone had those views on the risk and cared a lot about not having physical humans die, they would benefit from such an insurance deal! (They’d have to pay higher rates than aliens in more competent civilizations, of course.)
It’s more like: people don’t enter into insurance pools against cancer with the dude who smoked his whole life and has a tumor the size of a grapefruit in his throat.
Sure, but you’d potentially want to enter the pool at the age of 10 prior to starting smoking!
To make the analogy closer to the actual case, suppose you were in a society where everyone is selfish, but every person has a 1/10 chance of becoming fabulously wealthy (e.g., owning a galaxy). And, if you commit as of the age of 10 to pay 1/1,000,000 of your resources in the fabulously wealthy case, you can ensure that the version of you in the non-wealthy case gets very good health insurance. Many people would take such a deal, and this deal would also be a slam dunk for the insurance pool!
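In expected-resource terms (a sketch with the numbers above):

```python
p_wealthy = 0.1   # chance of ending up fabulously wealthy (one galaxy, normalized)
premium = 1e-6    # fraction of the wealthy outcome committed at age 10

expected_cost = p_wealthy * premium   # 1e-7 of a galaxy, in expectation
print(expected_cost)
# For a 10^-7 expected share of your possible galaxy, the 90%-likely
# non-wealthy version of you gets very good health insurance.
```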
(So why doesn’t this happen in human society? Well, to some extent it does: people try to get life insurance early, while they are still behind the veil of ignorance. It is common in human society to prefer to make a deal prior to having some knowledge. (If people ran the right type of UDT, this wouldn’t be a problem.) As for why people don’t enter into fully general income insurance schemes when very young, I think it’s a combination of irrationality, legal issues, and adverse selection issues.)
[1] Again, probably you can pay much less.