in full generality, as opposed to (e.g.) focusing on their brethren. (And I see no UDT/FDT justification for them to pay for even the particularly foolish and doomed aliens to be saved, and I’m not sure what you were alluding to there.)
[...]
rather than concentrating their resources on creatures more similar to themselves / that branched off radically more recently? (e.g. because the multiverse is just that full of kindness, or for some alleged UDT/FDT argument that Nate has not yet understood?)
Partial delta from me. I think the argument for directly paying for yourself (or your same species, or at least more similar civilizations) is indeed more clear and I think I was confused when I wrote that. (In that I was mostly thinking about the argument for paying for the same civilization but applying it more broadly.)
But, I think there is a version of the argument which probably does go through depending on how you set up UDT/FDT.
Imagine that you do UDT starting from your views prior to learning about x-risk, AI risk, etc., and you care a lot about not dying. At that point, you were uncertain about how competent your civilization would be, and you don’t want your civilization to die. (I’m supposing that our version of UDT/FDT isn’t logically omniscient relative to our observations, which seems reasonable.) So, you’d like to enter into an insurance agreement with all the aliens in a similar epistemic state and position. So, you all agree to put at least 1/1000 of your resources toward bailing out the aliens in a similar epistemic state who would have actually gone through with the agreement. Then, some of the aliens ended up being competent (sadly, you were not) and thus they bail you out.
I expect this isn’t the optimal version of this scheme, and you might be able to make a similar insurance deal with people who aren’t in the same epistemic state. (Though it’s easier to reason about the identical case.) And I’m not sure exactly how this all goes through. And I’m not actually advocating for people doing this scheme; IDK if it is worth the resources.
Even with your current epistemic state on x-risk (e.g., 80-90% doom), if you cared a lot about not dying, you might want to make such a deal, even though you have to pay out more in the case where you surprisingly win. Thus, from this vantage point, UDT would follow through with the deal.
Here is a simplified version where everything is as concrete as possible:
Suppose that there are 3 planets with evolved life with equal magical-reality-fluid (and nothing else for simplicity). For simplicity, we’ll also say that these planets are in the same universe and thus the resulting civilizations will be able to causally trade with each other in the far future.
The aliens on each of these planets really don’t want to die and would be willing to pay up to 1/1000 of all their future resources to avoid dying (paying these resources in cases where they avoid takeover and successfully use the resources of the future). (Perhaps this is irrational, but let’s suppose this is endorsed on reflection.)
On each planet, the aliens all agree that P(takeover) for their planet is 50%. (And let’s suppose it is uncorrelated between planets for simplicity.)
Let’s suppose the aliens across all planets also all know this; that is, they know that there are 3 planets, and so on.
So, the aliens would love to make a deal with each other where winning planets pay to stop the AIs from killing everyone on losing planets, so that the losing planets get bailed out. So, if at least one planet avoids takeover, everyone avoids dying. (Of course, if a planet would have defected and not paid out had it avoided takeover, the other aliens also wouldn’t bail it out.)
Do you buy that in this case, the aliens would like to make the deal and thus UDT from this epistemic perspective would pay out?
It seems like all the aliens are much better off with the deal from their perspective.
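To make the arithmetic behind that claim explicit, here is a minimal sketch in Python, using only the numbers stipulated above (the exact payment mechanics are simplified):

```python
# A minimal sketch of the three-planet deal, using the numbers stipulated above.
p_takeover = 0.5     # each planet's probability of AI takeover, independent across planets
premium = 1 / 1000   # fraction of future resources a winning planet pays in (upper bound from above)

# Without the deal: you survive only if your own planet avoids takeover.
p_survive_no_deal = 1 - p_takeover

# With the deal: you survive if your planet wins, or if at least one of the
# other two planets wins and bails you out.
p_both_others_lose = p_takeover ** 2
p_survive_with_deal = (1 - p_takeover) + p_takeover * (1 - p_both_others_lose)

print(p_survive_no_deal)            # 0.5
print(p_survive_with_deal)          # 0.875
print((1 - p_takeover) * premium)   # 0.0005: expected cost, paid only in the win-case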
Now, maybe your objection is that aliens would prefer to make the deal with beings more similar to them, and thus alien species/civilizations that are actually all incompetent just die. However, none of the aliens (including us) know whether they are the incompetent ones, so we’d all like to make a diverse and broader trade/insurance policy to avoid dying.
Do you buy that in this case, the aliens would like to make the deal and thus UDT from this epistemic perspective would pay out?
If they had literally no other options on offer, sure. But trouble arises when the competent ones can refine P(takeover) for the various planets by thinking a little further.
maybe your objection is that aliens would prefer to make the deal with beings more similar to them
It’s more like: people don’t enter into insurance pools against cancer with the dude who smoked his whole life and has a tumor the size of a grapefruit in his throat. (Which isn’t to say that nobody will donate to the poor guy’s gofundme, but which is to say that he’s got to rely on charity rather than insurance).
(Perhaps the poor guy argues “but before you opened your eyes and saw how many tumors there were, or felt your own throat for a tumor, you didn’t know whether you’d be the only person with a tumor, and so would have wanted to join an insurance pool! so you should honor that impulse and help me pay for my medical bills”, but then everyone else correctly answers “actually, we’re not smokers”. Where, in this analogy, smoking is being a bunch of incompetent disaster-monkeys and the tumor is impending death by AI.)
But trouble arises when the competent ones can refine P(takeover) for the various planets by thinking a little further.
Similar to how the trouble arises when you learn the result of the coin flip in a counterfactual mugging? To make it exactly analogous, imagine that the mugging is based on whether the 20th digit of pi is odd (Omega didn’t know the digit at the point of making the deal) and you could just go look it up. Isn’t the situation exactly analogous, and isn’t this the whole problem that UDT was intended to solve?
(For those who aren’t familiar with counterfactual muggings, UDT/FDT pays in this case.)
To spell out the argument, wouldn’t everyone want to make a deal prior to thinking more? Like you don’t know whether you are the competent one yet!
Concretely, imagine that each planet could spend some time thinking and be guaranteed to determine whether its P(takeover) is 99.99999% or 0.0000001%. But they haven’t done this yet, and their current view is 50%. Everyone would ex ante prefer making the deal over first thinking about it and then deciding whether the deal is still in their interest.
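As a toy check on that ex-ante preference, here is a sketch under the assumptions above (three planets, each equally likely to turn out ~doomed or ~safe once it thinks; the premium, at most 1/1000, is ignored for simplicity):

```python
# Sketch: "make the deal before thinking" vs. "think first, then decide", with three planets
# that will each learn they are either ~certainly doomed or ~certainly safe (each 50/50 ex ante).
import itertools

def p_survive(commit_before_thinking: bool) -> float:
    total = 0.0
    # Enumerate which planets turn out doomed (True) or safe (False); planet 0 is "us".
    for types in itertools.product([True, False], repeat=3):
        prob = 0.5 ** 3
        if not types[0]:
            survives = True                   # safe planets survive regardless
        elif commit_before_thinking:
            survives = not all(types)         # bailed out iff at least one planet turns out safe
        else:
            survives = False                  # planets that learn they're safe decline the deal
        total += prob * survives
    return total

print(p_survive(commit_before_thinking=False))  # 0.5
print(p_survive(commit_before_thinking=True))   # 0.875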
At a more basic level, let’s assume your current views on the risk after thinking about it a bunch (80-90% I think). If someone had those views on the risk and cared a lot about not having physical humans die, they would benefit from such an insurance deal! (They’d have to pay higher rates than aliens in more competent civilizations of course.)
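To put rough numbers on “higher rates”: under a simple actuarially-fair model (my assumption here, not anything specified above), the premium a civilization owes in its win-case scales like p/(1-p) in its doom probability p:

```python
# Sketch of how the "rate" scales with your doom probability, under an assumed actuarially-fair
# model: you pay premium c if you win, the pool funds a bailout worth b if you lose,
# and break-even per signatory requires (1 - p) * c = p * b.
def break_even_premium(p_doom: float, bailout_cost: float = 1.0) -> float:
    return bailout_cost * p_doom / (1 - p_doom)

for p in (0.5, 0.8, 0.9):
    print(p, break_even_premium(p))  # 1.0, 4.0, 9.0 (in units of the bailout's cost)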
It’s more like: people don’t enter into insurance pools against cancer with the dude who smoked his whole life and has a tumor the size of a grapefruit in his throat.
Sure, but you’d potentially want to enter the pool at the age of 10 prior to starting smoking!
To make the analogy closer to the actual case, suppose you were in a society where everyone is selfish, but every person has a 1/10 chance of becoming fabulously wealthy (e.g., owning a galaxy). And, if you commit as of the age of 10 to pay 1/1,000,000 of your resources in the fabulously wealthy case, you can ensure that the version of you in the non-wealthy case gets very good health insurance. Many people would take such a deal, and this deal would also be a slam dunk for the insurance pool!
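A quick expected-value check of that deal, with a made-up illustrative value for the health insurance (everything else comes from the example above):

```python
# Expected-value check of the age-10 commitment; the insurance value is an assumed number.
p_wealthy = 1 / 10
premium = 1 / 1_000_000        # fraction of your galaxy you commit to pay if wealthy
value_of_galaxy = 1.0          # normalize the galaxy's value to 1
value_of_insurance = 1e-4      # assumed: what good health insurance is worth to the non-wealthy you

expected_cost = p_wealthy * premium * value_of_galaxy      # 1e-07
expected_benefit = (1 - p_wealthy) * value_of_insurance    # 9e-05
print(expected_benefit > expected_cost)                    # True: a clear ex-ante win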
(So why doesn’t this happen in human society? Well, to some extent it does. People try to get life insurance early, while they are still behind the veil of ignorance. It is common in human society to prefer to make a deal prior to having some piece of knowledge. (If people implemented the right type of UDT, then this wouldn’t be a problem.) As far as why people don’t enter into fully general income insurance schemes when very young, I think it is a combination of irrationality, legal issues, and adverse selection issues.)
Background: I think there’s a common local misconception of logical decision theory that it has something to do with making “commitments” including while you “lack knowledge”. That’s not my view.
I pay the driver in Parfit’s hitchhiker not because I “committed to do so”, but because when I’m standing at the ATM and imagine not paying, I imagine dying in the desert. Because that’s what my counterfactuals say to imagine. To someone with a more broken method of evaluating counterfactuals, I might pseudo-justify my reasoning by saying “I am acting as you would have committed to act”. But I am not acting as I would have committed to act; I do not need a commitment mechanism; my counterfactuals just do the job properly no matter when or where I run them.
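For concreteness, here is a toy rendering of that counterfactual evaluation (the utilities are made up; the point is only that the policy comparison, not a commitment device, does the work):

```python
# Toy version of the counterfactual evaluation in Parfit's hitchhiker (made-up utilities).
# With an accurate driver, the choice at the ATM is evaluated as a policy: "refuse to pay"
# corresponds to the world where you were never rescued from the desert at all.
utilities = {
    "pay_and_live": 1_000_000 - 1_000,  # rescued, minus the fare
    "refuse_and_die": 0,                # left in the desert
}

def value_of_policy(pays_at_atm: bool) -> int:
    return utilities["pay_and_live"] if pays_at_atm else utilities["refuse_and_die"]

print(value_of_policy(True) > value_of_policy(False))  # True: so you pay, no commitment needed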
To be clear: I think there are probably competent civilizations out there who, after ascending, will carefully consider the places where their history could have been derailed, and carefully comb through the multiverse for entities that would be able to save those branches, and will pay those entities, not because they “made a commitment”, but because their counterfactuals don’t come with little labels saying “this branch is the real branch”. The multiverse they visualize in which the (thick) survivor branches pay a little to the (thin) derailed branches (leading to a world where everyone lives (albeit a bit poorer)) seems better to them than the multiverse they visualize in which no payments are made (and the derailed branches die, and the on-track branches are a bit richer), and so they pay.
There’s a question of what those competent civilizations think when they look at us, who are sitting here yelling “we can’t see you, and we don’t know how to condition our actions on whether you pay us or not, but as best we can tell we really do intend to pay off the AIs of random alien species—not the AIs that killed our brethren, because our brethren are just too totally dead and we’re too poor to save all but a tiny fraction of them, but really alien species, so alien that they might survive in such a large portion that their recompense will hopefully save a bigger fraction of our brethren”.
What’s the argument for the aliens taking that offer? As I understand it, the argument goes something like “your counterfactual picture of reality should include worlds in which your whole civilization turned out to be much much less competent, and so when you imagine the multiverse where you pay for all humanity to live, you should see that, in the parts of the multiverse where you’re totally utterly completely incompetent and too poor to save anything but a fraction of your own brethren, somebody else pays to save you”.
We can hopefully agree that this looks like a particularly poor insurance deal relative to the competing insurance deals.
For one thing, why not cut out the middleman and just randomly instantiate some civilization that died? (Are we working under the assumption that it’s much harder for the aliens to randomly instantiate you than to randomly instantiate the stuff humanity’s UFAI ends up valuing? What’s up with that?)
But even before that, there are all sorts of other juicier-looking opportunities. For example, suppose the competent civilization contains a small collection of rogues who they assess have a small probability of causing an uprising and launching an AI before it’s ready. They presumably have a pretty solid ability to figure out exactly what that AI would like and to offer trades to it directly, and that’s a much more appealing way to spend resources allocated to insurance. My guess is there are loads and loads of options like that that eat up all the spare insurance budget, before our cries get noticed by anyone who cares for the sake of decision theory (rather than charity).
Perhaps this is what you meant by “maybe they prefer to make deals with beings more similar to them”; if so I misunderstood; the point is not that they have some familiarity bias but that beings closer to them make more compelling offers.
The above feels like it suffices, to me, but there’s still another part of the puzzle I feel I haven’t articulated.
Another piece of background: To state the obvious, we still don’t have a great account of logical updatelessness, and so attempts to discuss what it entails will be a bit fraught. Plowing ahead anyway:
The best option in a counterfactual mugging with a logical coin and a naive predictor is to calculate the logical value of the coin flip and pay iff you’re counterfactual. (I could say more about what I mean by ‘naive’, but it basically just serves to render this statement true.) A predictor has to do a respectable amount of work to make it worth your while to pay in reality (when the coin comes up against you).
What sort of work? Well, one viewpoint on it (that sidesteps questions of “logically-impossible possible worlds” and what you’re supposed to do as you think further and realize that they’re impossible) is that the predictor isn’t so much demanding that you make your choice before you come across knowledge of some fact as offering to pay you if you render a decision that is logically independent of some fact. They don’t care whether you figure out the value of the coin, so long as you don’t base your decision on that knowledge. (There’s still a question of how exactly to look at someone’s reasoning and decide what logical facts it’s independent of, but I’ll sweep that under the rug.)
From this point of view, when people come to you and they’re like “I’ll pay you iff your reasoning doesn’t depend on X”, the proper response is to use some reasoning that doesn’t depend on X to decide whether the amount they’re paying you is more than VOI(X).
In cases where X is something like a late digit of pi, you might be fine (up to your ability to tell that the problem wasn’t cherry-picked). In cases where X is tightly intertwined with your basic reasoning faculties, you should probably tell them to piss off.
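Here is a toy numerical version of that comparison for the digit-of-pi case, with made-up stakes ($10,000 offered, $100 demanded):

```python
# Toy numbers for the "is the offer worth more than VOI(X)?" test, where X is a late digit of pi.
p_favorable = 0.5        # your prior that the logical coin favors you
offer = 10_000           # paid in the favorable branch, but only if your decision doesn't depend on X
cost = 100               # what you hand over in the unfavorable branch if you pay

expected_payment_for_independence = p_favorable * offer   # 5000.0
# The most that conditioning on X could gain you is skipping the $100 in the unfavorable branch:
voi_x = (1 - p_favorable) * cost                           # 50.0

print(expected_payment_for_independence > voi_x)  # True: for a digit of pi, take the offer and pay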
Someone who comes to you with an offer and says “this offer is void if you read the fine print or otherwise think about the offer too hard”, brings quite a bit of suspicion onto themselves.
With that in mind, it looks to me like the insurance policy on offer reads something like:
would you like to join the confederacy of civilizations that dedicate 1/million of their resources to pay off a random UFAI?
cost: 1/million of your resources.
benefit: any UFAI you release that is amenable to trade will be paid off with 1/million * 1/X, to allocate you however many resources that’s worth, where X is the fraction of people who take this deal and die (modulo whatever costs are needed to figure out which UFAIs belong to signatories, etc.)
caveat: this offer is only valid if your reasoning is logically independent from your civilizational competence level, and if your reasoning for accepting the proposal is not particularly skilled or adept
And… well this isn’t a knockdown argument, but that really doesn’t look like a very good deal to me. Like, maybe there’s some argument of the form “nobody in here is trying to fleece you because everyone in here is also stupid” but… man, I just don’t get the sense that it’s a “slam dunk”, when I look at it without thinking too hard about it and in a way that’s independent of how competent my civilization is.
Mostly I expect that everyone stooping to this deal is about as screwed as we are (namely: probably so screwed that they’re bringing vastly more doomed branches than saved ones, to the table).
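A rough sketch of that dilution worry (this assumes the survivors’ premiums are simply split evenly among the doomed signatories; the offer above doesn’t pin down the exact split rule):

```python
# If nearly everyone who signs is doomed, the survivors' premiums get spread very thin.
def payout_per_doomed(frac_doomed: float, premium: float = 1e-6) -> float:
    surviving_contribution = (1 - frac_doomed) * premium   # average contribution per signatory
    return surviving_contribution / frac_doomed            # resources reaching each doomed signatory

for x in (0.1, 0.5, 0.9, 0.99):
    print(x, payout_per_doomed(x))
# 0.1 -> 9e-06, 0.5 -> 1e-06, 0.9 -> ~1.1e-07, 0.99 -> ~1e-08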
Roughly speaking, I suspect that the sort of civilizations that aren’t totally fucked can already see that “comb through reality for people who can see me and make their decisions logically dependent on mine” is a better use of insurance resources, by the time they even consider this policy. And as for your plea to evaluate the policy in a fashion that’s logically independent of whether they’re smart enough to see that there are more foolproof options: I think they correctly see us as failing to offer more than VOI(WeCanThinkCompetently), because they are correctly suspicious that you’re trying to fleece them (which we kinda are; we’re kinda trying to wish ourselves into a healthier insurance pool).
Which is to say, I don’t have a full account of how to be logically updateless yet, but I suspect that this “insurance deal” comes across like a contract with a clause saying “void if you try to read the fine print or think too hard about it”. And I think that competent civilizations are justifiably suspicious, and that they correctly believe they can find other better insurance deals if they think a bit harder and void this one.