This is an old article, and it’s possible that this question has already been asked, but I’ve been looking through the comments and I can’t find it anywhere. So, here it is:
Why does it matter? If many-worlds is indistinguishable from the Copenhagen Interpretation by any experiment we can think of to do, how does it matter which model we use? If we ever find ourselves in a scenario where it actually does matter which one we use—one where using the wrong model will result in us making some kind of mistake—then we now have an experiment we can do to determine which model is correct. If we never find ourselves in such a position, it doesn’t matter which model we decided on.
When phrased this way, Science doesn’t seem to have such a serious problem. Saying “Traditional Science can lead to incorrect conclusions, but only about things that have no actual effect on the world” doesn’t sound like such a searing criticism.
Leaving phrasing aside, it’s worth drawing a distinction between things that have no actual effect on the world and things it’s impractical for me to currently observe.
A process that leads more reliably to correct conclusions about the former is perhaps useless.
A process that leads more reliably to correct conclusions about the latter is not.
Unfortunately, this is not what OP argues. There is no hint of suggesting that MWI may be testable some day (which it might be—when done by physicists, not amateurs). The MWI ontology seems to be slowly propagating through the physics community, even Sean Carroll seems to believe it now. Slide 34 basically repeats Eliezer almost verbatim.
Good question—it does not matter. Opinions on untestable questions are about taste, and arguing about taste is a waste of everyone’s time. The “LW consensus” is just wrong to insist on Everett (and about lots of other things, so it should not be too surprising—for example they insist on Bayes, and like EDT).
I know there are lots of people here who argue for EDT or various augmentations of EDT, but I hope that doesn’t count as a LW consensus.
Obviously LW opinion isn’t monolithic; I merely meant that UDT et al. seem to be based on EDT, and lots of folks around here are poking around with UDT. I gave a talk recently at Oxford about why I think basing things on EDT is a bad idea.
I want to watch your talk but videos are slow and the sound quality didn’t seem very good. So I’ll just point out that the point of UDT is to improve upon both EDT and CDT, and it’s wildly mischaracterising LW consensus to say that the interest in UDT suggests that people think EDT is good. They don’t even have much in common, technically. (Besides, even I don’t think EDT is good, and as far as I know I’m the only person who’s really bothered arguing for it.)
No, there are other folks who argue for EDT (I think Paul did). To be fair, I have a standing invitation for any proponent of EDT to sit me down and explain a steelman of EDT to me. This is not meant to trap people but to make progress, and maybe teach me something. The worry is that EDT fans actually haven’t quite realized just how tricky a problem confounding is (and this is a fairly “basic” problem that occurs long before we have to worry about Omega and his kin—gotta walk before you fly).
I would be willing to try to explain such to you, but as you know, I was unsuccessful last time :)
I think you have some unhelpful preconceptions about the capabilities of EDT, based on the fact that it doesn’t have “causal” in the name, or causal analysis anywhere directly in the math. Are you familiar with the artificial intelligence model AIXI?
AIXI is capable of causal analysis in much the same way that EDT is. Although neither of them explicitly includes the math of causal analysis, that math is computable, so some programs in AIXI’s hypothesis space do causal analysis. Given enough data, we can expect AIXI to start zooming in on those models and use them for prediction, effectively “learning about causality”.
If we wrote a hypothetical artificially intelligent EDT agent, it could certainly take a similar approach, given a large enough prior space—including, perhaps, all programs, some of which do causal analysis. Of course, in practice we don’t have an infinite amount of time to wait for our math to evaluate every possible program.
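As a toy illustration of that “zooming in” (my own sketch, not anything from the AIXI literature — the two hypotheses, the numbers, and the `posterior` helper are all invented), here is a Bayesian mixture over candidate predictor programs; observed data reweights the mixture toward whichever hypothesis predicts best:

```python
import math

# Two toy hypotheses about P(outcome | action). The "causal" one encodes a
# real treatment effect; the "spurious" one predicts coin flips regardless.
hypotheses = {
    "causal":   {("treat", 1): 0.8, ("treat", 0): 0.2,
                 ("skip", 1): 0.3, ("skip", 0): 0.7},
    "spurious": {("treat", 1): 0.5, ("treat", 0): 0.5,
                 ("skip", 1): 0.5, ("skip", 0): 0.5},
}
prior = {"causal": 0.5, "spurious": 0.5}

# Observed (action, outcome) data, generated by the "causal" process:
data = [("treat", 1), ("treat", 1), ("skip", 0), ("treat", 1), ("skip", 0)]

def posterior(prior, data):
    """Bayes: reweight each hypothesis by the likelihood of the data."""
    w = {h: p * math.prod(hypotheses[h][d] for d in data)
         for h, p in prior.items()}
    z = sum(w.values())
    return {h: v / z for h, v in w.items()}

post = posterior(prior, data)
print(post)  # mass concentrates on "causal" as data accumulates
```

Nothing causal appears in the update rule itself; the “causal knowledge” lives entirely inside one of the hypotheses, which is the point being made above.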
It’s slightly more practical to simply furnish your EDT calculation (when trying to work through such things as your HAART example by hand) with a prior that contains all the standard “causal-ish” conclusions, such as “if a data set shows that an intervention of type X on a subject of type Y results in effect Z, a similar intervention on a similar subject probably results in a similar effect”. But even that is extremely impractical, since we are forced to work at the meta-level, with hypothesis spaces including all possible data sets, (current) interventions, subjects, and effects.
In real life we don’t really do any of the above; we do something much more reasonable. But I hope the above breaks your preconception.
Too high-level: what is your actual algorithm for solving decision problems? That is, if I give you a problem, can you give me an answer? An actual problem, not a hypothetical one. I could even give you actual data if you want, and ask what specific action you will choose. I have a list of problems right here.
If it’s some uncomputable theoretical construct, it’s not really a serious competitor for doing decision theory. There is no algorithm! We want to build agents that actually act well, remember?
In real life I’m not actually in a position to decide whether to give HAART to a patient; that would be a hypothetical problem. In hypothetical real life, if I were to use EDT, what I would do is use Pearl’s causal analysis and some highly unproven assumptions (i.e., common sense) to derive a probability distribution over my hypothetical actual situation, and pick the action with the highest conditional expected utility. This is the “something much more reasonable” that I was alluding to.
The reason I explained all the impractical models above is that you need to understand that using common sense isn’t cheating, or anything illegal. It’s just an optimization, more or less equivalent to actually furnishing your EDT calculation with a prior that contains all the standard “causal-ish” conclusions. This is something real decision theorists do every day, because it’s not practical, even for a causal decision theorist, to work at the meta-level with all possible datasets, interventions, effects, etc.
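To make the conditional-expected-utility rule concrete, here is a minimal sketch (all numbers invented; `edt_choice` is my own hypothetical helper, not anyone’s published algorithm). The derived probability distribution is represented here as a finished joint over (action, outcome):

```python
# Joint probabilities P(action, outcome) for a toy treatment decision:
joint = {
    ("treat", "recover"): 0.40, ("treat", "decline"): 0.10,
    ("skip",  "recover"): 0.15, ("skip",  "decline"): 0.35,
}
utility = {"recover": 1.0, "decline": 0.0}

def edt_choice(joint, utility):
    """Return argmax_A of sum_outcome U(outcome) * P(outcome | A)."""
    actions = {a for a, _ in joint}
    def cond_eu(a):
        p_a = sum(p for (act, _), p in joint.items() if act == a)
        return sum(utility[o] * p / p_a
                   for (act, o), p in joint.items() if act == a)
    return max(actions, key=cond_eu)

print(edt_choice(joint, utility))  # "treat": P(recover | treat) = 0.8 vs 0.3
```

All the hard work — the common sense and the causal analysis — is hidden in how the joint was derived; the decision rule itself is just conditioning.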
Ok, thanks. I understand your position now.
No, you don’t. You’ve pattern-matched it to the nearest wrong thing: “you’re using causal analysis, you must be secretly using CDT!”
If I were using CDT, I would use Pearl’s causal analysis and common sense to derive a causal graph over my hypothetical actual situation, and pick the action with the highest interventional expected utility.
This is in fact something decision theorists do every day, because the assumption that a dataset about applying HAART to certain patients has anything at all to say about applying a similar treatment to a similar patient rests on lots of commonsense causal reasoning: HAART works by affecting the biology of the human body (and therefore should work the same way in two humans), it is unaffected by the positions of the stars (because they are not well causally connected to it), and so on.
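For contrast, here is a hedged sketch of the interventional rule on a toy confounded graph loosely modeled on the HAART example (severity C affects both treatment A and outcome O; every number is invented). Naive conditioning and Pearl’s adjustment formula disagree here, which is exactly the confounding worry raised earlier in this thread:

```python
# Toy graph: C -> A, C -> O, A -> O, with C a confounder (disease severity).
p_c = {"severe": 0.5, "mild": 0.5}                       # P(C)
p_a_given_c = {("severe", "treat"): 0.9, ("severe", "skip"): 0.1,
               ("mild",   "treat"): 0.1, ("mild",   "skip"): 0.9}
p_rec = {("treat", "severe"): 0.5, ("skip", "severe"): 0.2,  # P(recover | A, C)
         ("treat", "mild"):   0.9, ("skip", "mild"):   0.8}

def observational_eu(a):
    """P(recover | A=a) from the confounded joint P(C) P(A|C) P(O|A,C)."""
    num = sum(p_c[c] * p_a_given_c[(c, a)] * p_rec[(a, c)] for c in p_c)
    den = sum(p_c[c] * p_a_given_c[(c, a)] for c in p_c)
    return num / den

def interventional_eu(a):
    """P(recover | do(A=a)) = sum_c P(c) P(recover | a, c)  (backdoor adjustment)."""
    return sum(p_c[c] * p_rec[(a, c)] for c in p_c)

# Sicker patients get treated more often, so conditioning makes treatment
# look bad, while the adjusted (interventional) estimate favors it:
print(max(["treat", "skip"], key=observational_eu))   # "skip"  (confounded)
print(max(["treat", "skip"], key=interventional_eu))  # "treat" (adjusted)
```

The divergence between the two rules only appears because of the confounder; on an unconfounded graph both would pick the same action.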
When I read the philosophy literature, the way decision theory problems are presented is via examples: the smoking lesion is one such example, Newcomb’s problem is another. So when I ask you what your decision algorithm is, I am asking for something that (a) you can write down and I can follow step by step, (b) takes these examples as input, and (c) produces an output action.
What is your preferred algorithm that satisfies (a), (b), and (c)? Can you write it down for me in a follow-up post? If (a) is false, it’s not really an algorithm; if (b) is false, it’s not engaging with the problems people in the literature are struggling with; and if (c) is false, it’s not answering the question! So, for instance, anything based on AIXI is a non-starter, because you can’t write it down. Anything that you have not formalized in your head enough to write down is a non-starter as well.
I have been talking with you for a long time, and in all this time, never have you actually written down what it is you are using to solve decision problems. I am not sure why—do you actually have something specific in mind or not? I can write down my algorithm, no problem.
Here is the standard causal graph for Newcomb’s problem (note that this is a graph of the agent’s actual situation, not a graph of related historical data):
Given that graph, my CDT solution is to return the action A with the highest

sum_payoff { U(payoff) P(payoff | do(A), observations) }

Given that graph (you don’t need a causal graph, of course), my EDT solution is to return the action A with the highest

sum_payoff { U(payoff) P(payoff | A, observations) }

That’s the easy part. Are you asking me for an algorithm to turn a description of Newcomb’s problem in words into that graph? You probably know better than me how to do that.
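To show where the two formulas diverge, here is a toy rendering of both on Newcomb’s problem (the predictor’s accuracy, the prior over dispositions, and all helper names are my own assumptions; the payoffs are the usual stylized ones). The hidden disposition D causes both the prediction and the action; do(A) cuts the D → A edge:

```python
accuracy = 0.99        # assumed P(prediction matches disposition)
p_one_boxer = 0.5      # assumed prior P(D = one-box)

# Payoffs: the opaque box is "full" iff the predictor predicted one-boxing.
payoff = {("one", "full"): 1_000_000, ("one", "empty"): 0,
          ("two", "full"): 1_001_000, ("two", "empty"): 1_000}

def p_full_given_evidence(action):
    """EDT: the action is evidence about D, hence about the prediction.
    (Assumes the action deterministically matches the disposition.)"""
    return accuracy if action == "one" else 1 - accuracy

def p_full_given_do(action):
    """CDT: do(A) cuts D -> A, so the prediction depends only on the prior
    over D, regardless of the action chosen."""
    return p_one_boxer * accuracy + (1 - p_one_boxer) * (1 - accuracy)

def choose(p_full):
    def eu(a):
        q = p_full(a)
        return q * payoff[(a, "full")] + (1 - q) * payoff[(a, "empty")]
    return max(["one", "two"], key=eu)

print(choose(p_full_given_evidence))  # "one": EDT one-boxes
print(choose(p_full_given_do))        # "two": CDT two-boxes
```

Same utilities, same graph; the only difference is conditioning on A versus conditioning on do(A), which is exactly the difference between the two sums above.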
If I’ve understood this sequence correctly, Eliezer would disagree with you: “Traditional Science can lead to incorrect conclusions, but only about things that have no actual effect on the world” is a serious criticism. He calls out the “let me know when you’ve got a testable prediction” attitude as explicitly wrong.