Hopefully, you can tell the difference between an alarm you triggered and an alarm that you did not.
I can, and you can, but imagine that we’re trying to program a robot to make decisions in our place, and we can’t trust the robot to have our intuition.* Suppose we give it a utility function that prefers there not being a fire to there being a fire, but don’t give it control over its epistemology (so it can’t just alter its beliefs so it never believes in fires).
If we program it to choose actions which maximize P(O|a) in the two-node system, it’ll shut off the alarm in the hopes that it will make a fire less likely. If we program it to choose actions which maximize P(O|do(a)), it won’t make that mistake.
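To make the contrast concrete, here is a minimal sketch of the two-node system (fire causes alarm) in Python. The probabilities and function names are hypothetical, picked only for illustration; nothing here comes from a real library. Conditioning on the alarm state changes the probability of fire, while intervening on the alarm leaves fire at its prior.

```python
# Hypothetical numbers for the two-node graph Fire -> Alarm (illustration only).
P_FIRE = 0.01
P_ALARM_GIVEN_FIRE = 0.95
P_ALARM_GIVEN_NO_FIRE = 0.02


def p_fire_given_alarm(alarm_on: bool) -> float:
    """Observational P(fire | alarm state), computed with Bayes' rule."""
    like_fire = P_ALARM_GIVEN_FIRE if alarm_on else 1 - P_ALARM_GIVEN_FIRE
    like_no_fire = P_ALARM_GIVEN_NO_FIRE if alarm_on else 1 - P_ALARM_GIVEN_NO_FIRE
    joint_fire = like_fire * P_FIRE
    joint_no_fire = like_no_fire * (1 - P_FIRE)
    return joint_fire / (joint_fire + joint_no_fire)


def p_fire_given_do_alarm(alarm_on: bool) -> float:
    """Interventional P(fire | do(alarm state)).

    Setting the alarm by hand cuts the Fire -> Alarm arrow, so the alarm
    state carries no information about fire and fire keeps its prior.
    """
    return P_FIRE


if __name__ == "__main__":
    for alarm_on in (True, False):
        state = "on" if alarm_on else "off"
        print(f"P(fire | alarm {state})     = {p_fire_given_alarm(alarm_on):.4f}")
        print(f"P(fire | do(alarm {state})) = {p_fire_given_do_alarm(alarm_on):.4f}")
```

A robot ranking actions by the first function sees "alarm off" as making fire less likely and shuts the alarm off. A robot ranking by the second sees no difference between the two actions as far as fire is concerned.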
* People have built-in decision theories for simple problems, so it often seems strange to demo decision theories on problems small enough that the answer is obvious. But a major point of mathematical decision theories is to enable algorithmic computation of the correct decision in very complicated systems. Causal graphs for medical diagnosis can have hundreds, if not thousands, of nodes, and the impact on the network of adjusting some variables might be totally nonobvious. Treating some symptoms may have no effect on the progress of the disorder, while treating others does. Treating still others might make it slightly more likely that the disorder will be cured but significantly less likely that we can tell whether it has been cured, and calculating whether that tradeoff is worth it is potentially very complicated.
I can, and you can, but imagine that we’re trying to program a robot to make decisions in our place, and we can’t trust the robot to have our intuition.
A robot would always be able to tell if it’s an alarm it triggered. Humans are the ones that are bad at it. Did you actually decide to smoke because EDT is broken, or are you just justifying it like that and you’re actually doing it because you have smoking lesions?
If we program it to choose actions which maximize P(O|a) in the two-node system, it’ll shut off the alarm in the hopes that it will make a fire less likely.
Once it knows its sensor readings, knowing whether or not it triggers the alarm is no further evidence for or against a fire.
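That screening-off claim can be checked numerically. The sketch below assumes a three-node chain, fire -> sensor reading -> trigger decision, in which the robot's choice depends only on what its sensor reports. That structure, the numbers, and the function names are all assumptions made for illustration.

```python
# Hypothetical chain: Fire -> Sensor -> Trigger (illustration only).
P_FIRE = 0.01
P_SMOKE_GIVEN_FIRE = {True: 0.9, False: 0.05}     # P(sensor reads smoke | fire?)
P_TRIGGER_GIVEN_SMOKE = {True: 0.8, False: 0.1}   # assumed policy: P(trigger | smoke?)


def joint(fire: bool, smoke: bool, trigger: bool) -> float:
    """Joint probability of one full assignment under the chain."""
    p = P_FIRE if fire else 1 - P_FIRE
    p_smoke = P_SMOKE_GIVEN_FIRE[fire]
    p *= p_smoke if smoke else 1 - p_smoke
    p_trigger = P_TRIGGER_GIVEN_SMOKE[smoke]
    p *= p_trigger if trigger else 1 - p_trigger
    return p


def p_fire_given(smoke: bool, trigger=None) -> float:
    """P(fire | smoke), or P(fire | smoke, trigger) when trigger is given."""
    triggers = (True, False) if trigger is None else (trigger,)
    numerator = sum(joint(True, smoke, t) for t in triggers)
    denominator = sum(joint(f, smoke, t) for f in (True, False) for t in triggers)
    return numerator / denominator


if __name__ == "__main__":
    for smoke in (True, False):
        print(f"smoke={smoke}: P(fire | smoke) = {p_fire_given(smoke):.6f}")
        for trigger in (True, False):
            value = p_fire_given(smoke, trigger)
            print(f"  trigger={trigger}: P(fire | smoke, trigger) = {value:.6f}")
```

Under these assumptions the probability of fire is the same whether or not we also condition on the trigger decision, so for such a robot P(O|a) already conditions on everything relevant.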