A few thoughts: I haven’t strongly considered my prior on being able to save 3^^^3 people (more on this to follow). But whatever that prior is, if I’m approached by somebody claiming to be a Matrix Lord who can save 3^^^3 people, I’m not only faced with the problem of whether I ought to pay him the $5 - I’m also faced with the question of whether I ought to walk over to the next beggar on the street and pay him $0.01 to save 3^^^3 people. Is the Matrix Lord 500 times more likely to be able to save 3^^^3 people? From the outset, not really. And giving money to random people isn’t, on priors, any more likely to save lives than anything else.
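To put rough numbers on that comparison, here’s a minimal sketch of the per-dollar arithmetic; both priors are placeholders invented purely for illustration, not estimates I’d defend:

```python
# Minimal sketch of the $5-vs-$0.01 comparison. Both priors are made-up
# placeholders for illustration only.
p_matrix_lord   = 1e-20   # hypothetical prior that the Matrix Lord delivers
p_random_beggar = 1e-22   # hypothetical prior that a random beggar delivers

cost_matrix_lord = 5.00
cost_beggar      = 0.01

# Expected lives saved per dollar is (p * 3^^^3) / cost for each option. The
# unrepresentably huge 3^^^3 factor is the same on both sides and cancels, so
# paying the Matrix Lord only wins if his probability of delivering exceeds
# the beggar's by more than the cost ratio, i.e. by more than ~500x.
print(p_matrix_lord / p_random_beggar > cost_matrix_lord / cost_beggar)  # False with these placeholders
```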
Now suppose that said “Matrix Lord” opens the sky, splits the Red Sea, demonstrates his duplicator box on some fish and, sure, creates a humanoid Patronus. Now do I have more reason to believe that he is a Matrix Lord? Perhaps. Do I have reason to think that he will save 3^^^3 lives if I give him $5? I don’t see convincing reason to believe so, but I don’t see either view as problematic.
Obviously, once you’re not taking Hanson’s approach, there’s no problem with believing you’ve made a major discovery that can save an arbitrarily large number of lives.
But here’s where I noticed a bit of a problem in your analogy: In the dark energy case you say “if these equations are actually true, then our descendants will be able to exploit dark energy to do computations, and according to my back-of-the-envelope calculations here, we’d be able to create around a googolplex people that way.”
Well, obviously the odds of creating exactly a googolplex people this way are no better than about one in a googolplex. Why? Because those back-of-the-envelope calculations are going to get us (at best, say) an interval from 0.5 x 10^(10^100) to 2 x 10^(10^100) - an interval containing more than a googolplex distinct integers. Hence, the probability of any specific count will be very low, but the total probability across the interval might be very high. (This is worth contrasting with the single number in the Matrix Lord case above, where presumably your probability of saving 3^^^3 + 1 people is no higher than it was before.)
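A scaled-down analogue of that interval point, with a tiny stand-in for the googolplex (the real numbers fit nowhere) and a uniform spread of probability mass chosen purely for illustration:

```python
from fractions import Fraction

# "estimate" is a tiny stand-in for a googolplex; the uniform spread of mass
# over the 0.5x-to-2x interval is an illustrative assumption, nothing more.
estimate = 1000
interval = range(estimate // 2, 2 * estimate + 1)

p_interval = Fraction(9, 10)              # say the back-of-envelope range is 90% likely to be right
p_exact    = p_interval / len(interval)   # mass left on any one specific count

print(len(interval))   # 1501 distinct integers in the interval
print(p_exact)         # 9/15010: any exact count is individually very improbable
print(p_interval)      # 9/10: yet the interval as a whole is very probable
```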
Here’s the main problem I have with your solution:
“But if I actually see strong evidence for something I previously thought was super-improbable, I don’t just do a Bayesian update, I should also question whether I was right to assign such a tiny probability in the first place—whether it was really as complex, or unnatural, as I thought. In real life, you are not ever supposed to have a prior improbability of 10^-100 for some fact distinguished enough to be written down, and yet encounter strong evidence, say 10^10 to 1, that the thing has actually happened.”
Sure you do. As you pointed out, dice rolls. The sequence of rolls in a game of Risk will do this for you, and you have strong reason to believe that you played a game of Risk and the dice landed as they did.
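For concreteness, here’s how few ordinary d6 rolls it takes before the exact sequence you just watched had a prior below 10^-100 (a long game of Risk involves far more individual die rolls than that):

```python
import math

# Solve (1/6)^n < 10^-100 for n: how many d6 rolls before the exact observed
# sequence had a prior probability below 10^-100?
n = math.ceil(100 / math.log10(6))
print(n)                   # 129 rolls already suffice
print(6 ** n > 10 ** 100)  # True: that exact 129-roll sequence had prior < 1e-100
```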
We do probability estimates because we lack information. Your example of a mathematical theorem is a good one: Theorem X is true or false from the get-go. But whenever you give me new information, even if that information is framed as a question, it makes sense for me to do a Bayesian update. That’s why a lot of so-called knowledge paradoxes are silly: If you ask me whether I know who the president is, I can answer with 99%+ probability that it’s Obama; if you then ask me whether Obama is still breathing, I have to update based on my consideration of what prompted the question. I’m not committing a fallacy by saying 95%; I’m doing a Bayesian update, as I should.
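Here’s a minimal odds-form sketch of that update; the two likelihoods for how often such a question gets asked in each world are made-up placeholders, and only their ratio matters:

```python
# Odds-form Bayes update for the "is Obama still breathing?" example.
# The two likelihoods below are illustrative placeholders.
prior_alive = 0.999

p_question_if_alive = 0.001   # idle philosophical question
p_question_if_dead  = 0.05    # the questioner may have just seen the news

prior_odds       = prior_alive / (1 - prior_alive)            # ~999:1
likelihood_ratio = p_question_if_alive / p_question_if_dead   # 1:50
posterior_odds   = prior_odds * likelihood_ratio
posterior_alive  = posterior_odds / (1 + posterior_odds)

print(round(posterior_alive, 3))   # ~0.95: answering "95%" is an update, not a fallacy
```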
You’ll often find yourself updating your probabilities based on the knowledge that you were completely incorrect about something (even something mathematical) to begin with. That doesn’t mean you were wrong to assign the initial probabilities: You were assigning them based on your knowledge at the time. That’s how you assign probabilities.
In your case, you’re not even updating on an “unknown unknown” - that is, something you failed to consider even as a possibility (the existence of such things is the reason you keep all probabilities below 100% in the first place, since your knowledge is limited). You’re updating on something you had already considered. And I see absolutely no reason to label this a special non-Bayesian type of update that somehow dodges the problem. I could be missing something, but I don’t see a coherent argument there.
As an aside, the repeated references to how people misunderstood previous posts are distracting, to say the least. Couldn’t you just include a single link to Aaronson’s Large Numbers paper (or anything on up-arrow notation; I mention Aaronson’s paper because it’s fun)? After all, if you can’t understand tetration (and up), you’re not going to understand the article to begin with.
Now suppose that said “Matrix Lord” opens the sky, splits the Red Sea, demonstrates his duplicator box on some fish and, sure, creates a humanoid Patronus. Now do I have more reason to believe that he is a Matrix Lord? Perhaps. Do I have reason to think that he will save 3^^^3 lives if I give him $5? I don’t see convincing reason to believe so, but I don’t see either view as problematic.
Honestly, at this point, I would strongly update in the direction that I am being deceived in some manner. Possibly I am dreaming, or drugged, or the person in front of me has some sort of perception-control device. I do not see any reason why someone who could open the sky, split the Red Sea, and so on, would need $5; and if he did, why not make it himself? Or sell the fish?
The only reasons I can imagine for a genuine Matrix Lord pulling this on me are very bad for me. Either he’s a sadist who likes people to suffer—in which case I’m doomed no matter what I do—or there’s something that he’s not telling me (perhaps doing what he says once surrenders my free will, allowing him to control me forever?), which implies that he believes that I would reject his demand if I knew the truth behind it, which strongly prompts me to reject his demand.
Or he’s insane, following no discernible rules, in which case the only thing to do is to try to evade notice (something I’ve clearly already failed at).
Either he’s a sadist who likes people to suffer—in which case I’m doomed no matter what I do—or there’s something that he’s not telling me (perhaps doing what he says once surrenders my free will, allowing him to control me forever?), which implies that he believes that I would reject his demand if I knew the truth behind it, which strongly prompts me to reject his demand.
That your universe is controlled by a sadist doesn’t suggest that every possible action you could take is equivalent. Maybe all your possible fates are miserable, but some are far more miserable than others. More importantly, a being might be sadistic in some respects/situations but not in others.
I also have to assign a very, very low prior to anyone’s being able to figure out in 5 minutes what the Matrix Lord’s exact motivations are. Your options are too simplistic even to describe minds of human-level complexity, much less ones of the complexity required to design or oversee physics-breakingly large simulations.
I think indifference to our preferences (except as incidental to some other goal, e.g., paperclipping) is more likely than either sadism or beneficence. Only very small portions of the space of values focus on human-style suffering or joy, even in hypotheticals that seem designed to play with human moral intuitions. Eliezer’s decision-theory-conference explanation makes as much sense as any.
That your universe is controlled by a sadist doesn’t suggest that every possible action you could take is equivalent. Maybe all your possible fates are miserable, but some are far more miserable than others.
You are right. However, I can see no way to decide which course of action is best (or least miserable). My own decision process becomes questionable in such a situation; I can’t imagine any strategy that is convincingly better than taking random actions.
When I say “doomed no matter what I do”, I do not mean doomed with certainty. I mean that I have a high probability of doom, for any given action, and I cannot find a way to minimise that probability through my own actions.
I think indifference to our preferences (except as incidental to some other goal, e.g., paperclipping) is more likely than either sadism or beneficence.
Thinking about this, I think that you are right. I still consider sadism more likely than beneficence, but I had been setting the prior for indifference too low. This implies that the Matrix Lord has preferences, but these preferences are unknown and possibly unknowable (perhaps he wants to maximise slood).
...
This makes the question of which action to take even more difficult to answer. I do not know anything about slood; I cannot, because it only exists outside the Matrix. The only source of information from outside the Matrix is the Matrix Lord. This implies that, before reaching any decision, I should spend a long time interviewing the Matrix Lord, in an attempt to model him better.
However, I can see no way to decide which course of action is best (or least miserable). My own decision process becomes questionable in such a situation; I can’t imagine any strategy that is convincingly better than taking random actions.
Well, this Matrix Lord seems very interested in decision theory and utilitarianism. Sadistic or not, I expect such a being to respond more favorably to attempts to take the dilemmas he raised seriously than to an epistemic meltdown. Taking the guy at his word and trying to reason your way through the problem is likely to give him more useful data than attempts to rebel or go crazy, and if you’re useful then it’s less likely that he’ll punish you or pull the plug on your universe’s simulation.
It seems reasonably likely that this will lead to a response of “...alright, I’ve got the data that I wanted, no need to keep this simulation running any longer...” followed by his pulling the plug on my universe. While it is true that this strategy is likely to lead to a happier Matrix Lord (especially if the data that I give him coincides with the data he expects), I’m not convinced that it leads to a longer existence for my universe.
That may be true too. It depends on the priors we have for generic superhuman agents’ reasons for keeping a simulation running (e.g., having some other science experiments planned, wanting to reward you for providing data...) vs. for shutting it down (e.g., vindictiveness, energy conservation, being interested only in one data point per simulation...).
We do have some data to work with here, since we have experience with the differential effects of power, intelligence, curiosity, etc. among humans. That data is only weakly applicable to such an exotic agent, but it does play a role, so our uncertainty isn’t absolute. My main point was that unusual situations like this don’t call for complete decision-theoretic despair; we still need to make choices, and we can still do so reasonably, though our confidence that the best decision is also a winning decision is greatly diminished.
Well, if I’m going to free-form speculate about the scenario, rather than use it to explore the question it was introduced to explore, the most likely explanation that occurs to me is that the entity is doing the Matrix Lord equivalent of free-form speculating… that is, it’s wondering “what would humans do, given this choice and that information?” And, it being a Matrix Lord, its act of wondering creates a human mind (in this case, mine) and gives it that choice and information.
Which makes it likely that I haven’t actually lived through most of the life I remember, and that I won’t continue to exist much longer than this interaction, and that most of what I think is in the world around me doesn’t actually exist.
That said, I’m not sure what use free-form speculating about such bizarre and underspecified scenarios really is, though I’ll admit it’s kind of fun.
That said, I’m not sure what use free-form speculating about such bizarre and underspecified scenarios really is, though I’ll admit it’s kind of fun.
It’s kind of fun. Isn’t that reason enough?
Looking at the original question—i.e. how to handle very large utilities with very small probability—I find that I have a mental safety net there. The safety net says that the situation is a lie. It does not matter how much utility is claimed, because anyone can state any arbitrarily large number, and a number has been chosen (in this case, by the Matrix Lord) in a specific attempt to overwhelm my utility function. The small probability is chosen (a) because I would not believe a larger probability and (b) so that I have no recourse when it fails to happen.
I am reluctant to fiddle with my mental safety nets because, well, they’re safety nets—they’re there for a reason. And in this case, the reason is that such a fantastically unlikely event is unlikely enough that it’s not likely to happen ever, to anyone. Not even once in the whole history of the universe. If I (out of all the hundreds of billions of people in all of history) do ever run across such a situation, then it’s so incredibly overwhelmingly more likely that I am being deceived that I’m far more likely to gain by immediately jumping to the conclusion of ‘deceit’ than by assuming that there’s any chance of this being true.
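The arithmetic behind that safety net is just posterior odds. Using Eliezer’s own illustrative numbers (a 10^-100 prior and 10^10-to-1 evidence), the genuine-offer hypothesis still loses by an enormous margin:

```python
from fractions import Fraction

# Posterior odds with an illustrative 10^-100 prior and 10^10:1 evidence.
prior_odds_genuine = Fraction(1, 10**100)   # prior odds that the offer is genuine
likelihood_ratio   = Fraction(10**10)       # strength of the "sky opens" evidence

posterior_odds = prior_odds_genuine * likelihood_ratio
print(posterior_odds == Fraction(1, 10**90))   # True: deceit (dreaming, drugs...) still dominates
```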
In real life, you are not ever supposed to have a prior improbability of 10^-100 for some fact distinguished enough to be written down, and yet encounter strong evidence, say 10^10 to 1, that the thing has actually happened.
Sure you do. As you pointed out, dice rolls. The sequence of rolls in a game of Risk
Those aren’t “distinguished enough to be written down” before the game is played. I’ll edit to make this slightly clearer, hopefully.