Early investigators in Artificial Intelligence, who were trying to represent all high-level events using primitive tokens in a first-order logic (for reasons of historical stupidity we won’t go into), were stymied by the following apparent paradox:
[Description of a system with the following three theorems: ⊢ ALARM → BURGLAR, ⊢ EARTHQUAKE → ALARM, and ⊢ (EARTHQUAKE & ALARM) → NOT BURGLAR]
Which represents a logical contradiction.
This isn’t a logical contradiction: perhaps what you mean is that we can deduce from this system that EARTHQUAKE is false. This would give us a contradiction in a modal system, if we also had the theorem ⊢ possibly(EARTHQUAKE), but as it stands it isn’t yet contradictory.
You clearly understand this, but I’ll make it explicit for observers:
1. A → B means that it cannot be the case that A is true and B is false.
2. E → A means that it cannot be the case that E is true and A is false.
3. (E & A) → !B means that it cannot be the case that E is true, A is true, and B is true.
Suppose we learn that E is false. We can’t infer anything about A and B, except that it cannot be that A is true and B is false.
Suppose we learn that E is true. By 2, we know that A cannot be false, and so must be true. By 1, we know that B cannot be false. By 3, we know that B cannot be true. B has no possible values, which is a contradiction.
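To make that concrete, here is a minimal sketch (my own illustration, not part of the original exchange) that brute-forces all eight truth assignments: every assignment satisfying the three axioms has E false, so on its own the system proves NOT EARTHQUAKE rather than an outright contradiction, exactly as described above.

```python
# Brute-force check of the three axioms over all truth assignments.
from itertools import product

def consistent(e, a, b):
    """True iff the assignment (EARTHQUAKE, ALARM, BURGLAR) satisfies all three axioms."""
    axiom1 = (not a) or b          # ALARM -> BURGLAR
    axiom2 = (not e) or a          # EARTHQUAKE -> ALARM
    axiom3 = not (e and a and b)   # (EARTHQUAKE & ALARM) -> NOT BURGLAR
    return axiom1 and axiom2 and axiom3

models = [(e, a, b) for e, a, b in product([False, True], repeat=3) if consistent(e, a, b)]
print(models)                        # every surviving model has e == False
print(any(e for e, _, _ in models))  # False: the system entails NOT EARTHQUAKE
```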
E is a sensor reading about reality, and so ⊢ possibly(E) is meant to be implied. (Writing down those three statements on a piece of paper can’t force the earth to stop shaking!)
One of the improvements made to solve this problem was to introduce probability: the idea that instead of treating the links between A, E, and B as deterministic, we treat them as stochastic. That’s the Bayesian network idea, and with those it’s harder to get contradictions (you still can, by misspecifying your distributions).
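As a hedged sketch of what that buys you (the network structure E → A ← B and every number below are my own illustrative assumptions, not something from this thread): with stochastic links, conditioning on both ALARM and EARTHQUAKE yields a small but well-defined posterior on BURGLAR, the familiar "explaining away" behaviour, instead of a contradiction.

```python
# Toy Bayesian network: EARTHQUAKE and BURGLAR are independent causes of ALARM.
# All probabilities are invented for illustration.
P_E = {True: 0.01, False: 0.99}                 # prior on EARTHQUAKE
P_B = {True: 0.02, False: 0.98}                 # prior on BURGLAR
P_A = {                                         # P(ALARM = True | EARTHQUAKE, BURGLAR)
    (True,  True):  0.98,
    (True,  False): 0.90,
    (False, True):  0.95,
    (False, False): 0.01,
}

def joint(e, b, a):
    """Joint probability of one full assignment (EARTHQUAKE, BURGLAR, ALARM)."""
    p_a = P_A[(e, b)] if a else 1 - P_A[(e, b)]
    return P_E[e] * P_B[b] * p_a

# P(BURGLAR = True | ALARM = True, EARTHQUAKE = True) by enumeration
num = joint(True, True, True)
den = sum(joint(True, b, True) for b in (True, False))
print(num / den)   # ~0.02: a small but coherent posterior, not a contradiction
```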
The causal model is an improvement even beyond that, because it allows you to deal with interventions in the system. Suppose we know that alarms and burglars are perfectly correlated. This could be either because burglars always set off alarms, or because alarms always attract burglars. If you’re a burglar who would like to steal from a house when there isn’t an earthquake, the difference is important! If you knew which causal structure held, you could predict what would happen when you yourself steal from the house.
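Here is a sketch of that difference (both model structures and the simulation are my own assumptions for illustration): the two models below are observationally identical, with ALARM and BURGLAR perfectly correlated, but they answer the interventional question do(BURGLAR = True), "what happens if I break in?", differently.

```python
# Two causal models with the same observational distribution but different
# responses to the intervention do(BURGLAR = True).
import random

def model_burglar_causes_alarm(do_burglar=None):
    burglar = random.random() < 0.1 if do_burglar is None else do_burglar
    alarm = burglar                      # burglars always set off alarms
    return burglar, alarm

def model_alarm_attracts_burglar(do_burglar=None):
    alarm = random.random() < 0.1
    burglar = alarm if do_burglar is None else do_burglar   # alarms attract burglars
    return burglar, alarm

for model in (model_burglar_causes_alarm, model_alarm_attracts_burglar):
    samples = [model(do_burglar=True) for _ in range(10_000)]
    p_alarm = sum(a for _, a in samples) / len(samples)
    print(model.__name__, "P(ALARM | do(BURGLAR=True)) ~", p_alarm)
# First model: ~1.0 (your break-in triggers the alarm).
# Second model: ~0.1 (your break-in leaves the alarm rate unchanged).
```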