First of all, we need to start making a distinction between what you predict I’ll do and what I’m signaling I’m going to do. Quick-and-dirty explanation of why this is necessary: If you predict I’ll cooperate but you’re planning to defect, I’ll signal to defy your prediction and defect along with you.
I think Clippy’s statement should be
I signal to cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
Detailed explanation follows.
There are four situations where I have to decide what to signal:
You predict I’ll cooperate and you’re planning to cooperate.
You predict I’ll cooperate and you’re planning not to cooperate.
You predict I’ll defect and you’re planning to cooperate.
You predict I’ll defect and you’re planning to defect.
I want to cooperate in situation 1 only, and in none of the others.
Truth table key:
P is the proposition “You predict I’ll cooperate”
Q is the proposition “You’re going to cooperate”
S is the proposition “I’m signaling I will cooperate”
Truth table:
P | Q | Q ⇔ P | (Q ⇔ P) & Q | S | S ⇔ ((Q ⇔ P) & Q)
1. T | T | T | T | T | T
2. T | F | F | F | F | T
3. F | T | F | F | F | T
4. F | F | T | F | F | T
So basically, the signaling behavior I described (cooperating in situation 1 only) is the only possible behavior that can truthfully satisfy the statement
I signal to cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
Note that there is a signal that is almost as good. Signaling that I will cooperate if (you predict I’ll defect and you’re planning to cooperate) is almost as good as signaling that I’ll defect in that situation. Using this signaling profile, broadcasting one’s intentions is as simple as saying
I signal to cooperate with you if and only if you’re planning to cooperate with me.
My guess is that the first, more complicated signal is ever-so-slightly better, in case you actually do cooperate thinking I’ll defect—that way I’ll be able to reap the rewards of defection without being inconsistent with my signal. But of course, it’s very unlikely for you to cooperate thinking I’ll defect.
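For concreteness, here is a minimal sketch in Python (the atom names and helper are my own, not part of the comment) that enumerates the four situations and shows which behavior a truthful signaler is committed to under each signal; it reproduces the table above and the situation-3 difference just described.

```python
from itertools import product

def iff(a, b):
    return a == b

# P: "you predict I'll cooperate"; Q: "you're planning to cooperate".
# A truthful signaler must make S ("I cooperate") match the signaled condition.
for P, Q in product([True, False], repeat=2):
    s_long = iff(Q, P) and Q   # S under "S iff ((Q iff P) and Q)"
    s_short = Q                # S under the simpler "S iff Q"
    print(f"P={P} Q={Q}  long-signal S={s_long}  short-signal S={s_short}")
# Only P=Q=True (situation 1) forces cooperation under the long signal;
# the short signal additionally forces cooperation in situation 3 (P=False, Q=True).
```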
I signal to cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
Should the word “signal” be part of the signal itself? That seems unnecessarily recursive. Maybe Clippy’s recommendation should be that I ought to signal
I will cooperate with you if and only if ((you’re planning to cooperate with me if and only if you predict I would cooperate with you) and you would cooperate with me).
This does seem more promising than Clippy’s original version. Written this way, each atomic proposition is distinct. For example, “you’re planning to cooperate with me” doesn’t mean the same thing as “you would cooperate with me”. One refers to what you’re planning to do, and the other refers to what you will in fact do. Read this way, the signal’s form is
S ⇔ ((Q ⇔ P) & R),
and I don’t see any obvious problem with that.
However, you would seem to render it in the propositional calculus as
S ⇔ ((Q ⇔ P) & Q),
where
P = You predict I’ll cooperate,
Q = You’re going to cooperate,
S = I will cooperate.
(I’ve omitted the initial “I’m signalling” from your rendering of S, for the reason that I gave above.)
Now, S ⇔ ((Q ⇔ P) & Q) is logically equivalent to S ⇔ (Q & P). So, to signal this proposition is to signal
I will cooperate iff you’re going to cooperate and you predict that I’ll cooperate.
As you say, this seems very similar to signalling
I will cooperate iff you will cooperate.
In fact, I’d call these signals functionally indistinguishable because, if you believe my signals, then either signal will lead you to predict my cooperation under the same circumstances.
For, suppose that I gave the second, apparently weaker signal. If you cooperated with me while anticipating that I would defect, then that would mean that you didn’t believe me when I said that I would cooperate with you if you cooperated with me, which would mean that you didn’t believe my signal.
Thus, insofar as you trust my signals, either signal would lead you to predict the same behavior from me. So, in that sense, they have the same informational content.
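To back up the equivalence claim above, here is a small exhaustive check (a sketch only; the `iff` helper and variable names are mine):

```python
from itertools import product

def iff(a, b):
    return a == b

# S <=> ((Q <=> P) & Q) and S <=> (Q & P) agree under every truth assignment,
# so the two signals express the same proposition.
assert all(
    iff(S, iff(Q, P) and Q) == iff(S, Q and P)
    for P, Q, S in product([True, False], repeat=3)
)
print("S <=> ((Q <=> P) & Q) is logically equivalent to S <=> (Q & P)")
```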
I guess. Or maybe I’m a masochist ;)
I accept all your suggested improvements.
But P ⇔ (Q ⇔ P) differs from Q in that:
a) if the other party chooses the same decision theory from that party’s standpoint, Q ⇔ (P ⇔ Q), then the outcome will be P & Q.
and
b) “I” cannot set the value of Q, but “I” can set the value of P ⇔ (Q ⇔ P), and just the same, “you” cannot set the value of P, but “you” can set the value of Q ⇔ (P ⇔ Q).
If “you” knows that “I” have set P ⇔ (Q ⇔ P) to true, “you” knows that “you” can set Q ⇔ (P ⇔ Q) to true as well. If this commitment is also demonstrable, then the outcome is P & Q, because that is what
(P ⇔ (Q ⇔ P)) & (Q ⇔ (P ⇔ Q))
reduces to.
a) if the other party chooses the same decision theory from that party’s standpoint, Q ⇔ (P ⇔ Q), then the outcome will be P & Q.
Actually, P ⇔ (Q ⇔ P) and Q are the same in this respect (being logically equivalent, and so the same in all functional respects).
If Party 1 believes that Q, then Party 1 believes that Party 2 would cooperate. And if Party 2 believes that Q, then, “from that party’s standpoint”, Party 2 believes that Party 1 would cooperate. Thus, in exactly the same sense that you meant, we again have that “the outcome wi[ll] be P & Q.”
b) “I” cannot set the value of Q, but “I” can set the value of P ⇔ (Q ⇔ P), and just the same, “you” cannot set the value of P, but “you” can set the value of Q ⇔ (P ⇔ Q).
But “I” cannot set the value of P ⇔ (Q ⇔ P). As my truth-table showed, the value of P ⇔ (Q ⇔ P) depends only on the value of Q, and not on the value of P. Since, as you say, I cannot set the value of Q, it follows that I cannot set the value of P ⇔ (Q ⇔ P).
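A quick brute-force check of the claim that P ⇔ (Q ⇔ P) is fixed by Q alone (again just a sketch, with my own naming):

```python
from itertools import product

def iff(a, b):
    return a == b

# Under every assignment, P <=> (Q <=> P) takes exactly the value of Q,
# so its truth value does not depend on P (the only thing "I" control).
for P, Q in product([True, False], repeat=2):
    assert iff(P, iff(Q, P)) == Q
print("P <=> (Q <=> P) always has the same truth value as Q")
```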
If “you” knows that “I” have set P ⇔ (Q ⇔ P) to true, “you” knows that “you” can set Q ⇔ (P ⇔ Q) to true as well. If this commitment is also demonstrable, then the outcome is P & Q, because that is what
(P ⇔ (Q ⇔ P)) & (Q ⇔ (P ⇔ Q))
reduces to.
Indeed, it does so reduce because the first conjunct is equivalent to Q, while the second conjunct is equivalent to P.
Indeed, it does so reduce because the first conjunct is equivalent to Q, while the second conjunct is equivalent to P.
It is logically equivalent, but it is not equivalent decision-theoretically. Setting your opponent’s actions is not an option.
I can set P. I can set P conditional on Q. I can set P conditional on Q’s conditionality on P. But I can’t choose Q as my decision theory.
A promise to predicate my actions on your actions’ predication on my actions is not the same as a promise for you to do an action (whatever that would mean).
It is logically equivalent, but it is not equivalent decision-theoretically. Setting your opponent’s actions is not an option.
It is logically impossible for me to implement a course of action such that
P ⇔ (Q ⇔ P)
and
~Q
could both be accurate descriptions of what occurred. Therefore, if I do not know that Q will be true, then I cannot promise that P ⇔ (Q ⇔ P) will be true. You could force me to have failed to keep my promise simply by not cooperating with me.
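The same point as a satisfiability check, sketched in Python (helper and names mine): no assignment makes the promised biconditional true while Q is false.

```python
from itertools import product

def iff(a, b):
    return a == b

# (P <=> (Q <=> P)) & ~Q is unsatisfiable: the other party's defection
# falsifies the promise no matter what "I" do with P.
assert not any(
    iff(P, iff(Q, P)) and not Q
    for P, Q in product([True, False], repeat=2)
)
```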
This is just an issue of distinguishing between causal and logical equivalence.
If a paperclip truck overturned, there will be paperclips scattered on the ground.
If a Clippy just used up metal haphazardly, there will be paperclips scattered on the ground.
Paperclips being scattered on the ground suggests a paperclip truck may have overturned.
Paperclips being scattered on the ground suggests a Clippy may have just used metal haphazardly.
A Clippy just used up metal haphazardly.
Therefore, a paperclip truck probably overturned, right?
Good to know Clippy hasn’t read Judea Pearl yet.
Yes, pretty much kills the “Clippy is Eliezer” theory.
Not necessarily, since the “Clippy is Eliezer” theory implied not “Clippy’s views and knowledge correspond to Eliezer’s” but “Clippy represents Eliezer testing us on a large scale”.
(I don’t actually think there’s enough evidence for this hypothesis, but I also don’t think an apparent lack of knowledge of Pearl is strong evidence against it.)
Not necessarily, since the “Clippy is Eliezer” theory implied not “Clippy’s views and knowledge correspond to Eliezer’s” but “Clippy represents Eliezer testing us on a large scale”.
I don’t think that Eliezer would test us with a character that was quite so sloppy with its formal logical and causal reasoning. For one thing, I think that he would worry about others’ adopting the sloppy use of these tools from his example.
Also, one of Eliezer’s weaker points as a fiction writer is his inability to simulate poor reasoners in a realistic way. His fictional poor-reasoners tend to lay out their poor arguments with exceptional clarity, almost to the point where you can spot the exact line where they add 2 to 2 and get 5. They don’t have muddled worldviews, where it’s a challenge even to grasp what they are thinking. (Such as, just what is Clippy thinking when it says that P ⇔ (Q ⇔ P) is a causal network?) Instead, they make discrete well-understood mistakes, fallacies that Eliezer has named and described in the sequences. Although these mistakes can accumulate to produce a bizarre worldview, each mistake can be knocked down, one after the other, in a linear fashion. You don’t have the problem of getting the poor-reasoners just to state their position clearly.
Also, one of Eliezer’s weaker points as a fiction writer is his inability to simulate poor reasoners in a realistic way.
As an aside, to see poor reasoning done in a very compelling way, read Umberto Eco. In particular, The Island of the Day Before and Baudolino contain extended examples of people trying to reason absent any kind of scientific framework.
Also, one of Eliezer’s weaker points as a fiction writer is his inability to simulate poor reasoners in a realistic way. His fictional poor-reasoners tend to lay out their poor arguments with exceptional clarity, almost to the point where you can spot the exact line where they add 2 to 2 and get 5. They don’t have muddled worldviews, where it’s a challenge even to grasp what they are thinking.
Good observation. It would barely be less subtle if Dumbledore had just said “I’m privileging an arbitrary hypothesis!” in the scene regarding Harry’s parents’ large rock. And when Draco said something to the effect of “I’d rig the experiments to make them come out right” after Harry asked what he’d do if an experiment showed muggle-borns were not worse at magic than pure-blood wizards, etc.
Then again, these particular instances may be explained as 1) Dumbledore has some secret brilliant plan in which the rock actually is important, and his overtly-fallacious explanation was just part of his apparent pattern of explicitly trying to model certain tropes; and 2) Draco has been trained in sophistry and fed very strong unsupported beliefs his whole life, to the point where he may not even realize that there is any purpose of experiments beyond convincing people of what one already believes. Still, I see your point.
Edit: These don’t count as spoilers, do they? They don’t mean much out of context (and they didn’t really seem like significant plot points in context anyway).
If one wants other examples, there’s a pretty similar problem in Eliezer’s The Sword of Good.
ROT13ed for spoilers: Va snpg, gur ceboyrzf jrer fb oyngnag gung gur svefg gvzr V ernq vg V fhfcrpgrq gung vg jnf tbvat gb ghea bhg gung gur qnex fvqr jnf npghnyyl tbbq va fbzr jnl. Gur fgrc gung ernyyl znqr vg frrz yvxryl jnf jura gurl ner qvfphffvat gur yvsr rkgrafvba hfvat gur jbezf nf rivy. Ryvrmre znqr vg ernyyl pyrne gung gur cevznel ceboyrz gurl unq jnf guvf jnf tebff.
I agree that I’m not “Eliezer”, but I don’t see what was unclear about saying that “Setting someone else’s actions” is not the same as “Predicating your actions on [reliable expectation of] someone else’s actions’ predication on [reliable expectation of] your actions”.
I agree that it is not literally correct to say that P ⇔ (Q ⇔ P) is a causal network, and that was an error of imprecision on my part. My point (in the remark you refer to) was that the decision theory I stated in the article, which you have lossily represented as P ⇔ (Q ⇔ P), obeys the rules of causal equivalence, not logical equivalence. (Applying the rules of the latter to the former results in such errors as believing that a Clippy haphazardly making paperclips implies that a paperclip truck might have overturned, or that setting others’ actions is the same as setting your actions to depend on others’ actions.)
A more rigorous specification of the decision theory corresponding to
“I would cooperate with you if and only if (you would cooperate with me if and only if I would cooperate with you).” would involve more than just P ⇔ (Q ⇔ P).
I haven’t built up the full formalism of humans credibly signaling their decision theories in this discussion, involving the roles of expectations, because that wasn’t the point of the article; it’s just to show that there are cooperation-favoring signals you can give that would favor a global move toward cooperation if you could make the signal significantly more reliable. If that point more heavily depended on stating the formalism, I would have gone into more detail on it in the discussion, if not the article.
I agree that I’m not “Eliezer”, but I don’t see what was unclear about saying that “Setting someone else’s actions” is not the same as “Predicating your actions on [reliable expectation of] someone else’s actions’ predication on [reliable expectation of] your actions”.
This is clearer, and I now think that I understand what you meant. You’re saying that humans should signal
I will cooperate with you if and only if I expect that (you will cooperate with me if and only if you expect that I will cooperate with you).
Here, the “if and only if”s can be treated as material biconditionals, but the “expect that” operators prevent the logical reduction to “you will cooperate with me” from going through.
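One crude way to see why the reduction is blocked: treat the whole expectation clause as a fresh propositional atom (the simplest possible model of an opaque operator). This is only a sketch with atoms named by me, not a serious epistemic logic:

```python
from itertools import product

def iff(a, b):
    return a == b

# C_me: I cooperate; C_you: you cooperate;
# E_me: "I expect that (you cooperate iff you expect me to cooperate)",
# treated as an independent atom because expectations can be mistaken.
# The signal is then C_me <=> E_me, and it no longer entails C_you:
countermodels = [
    (C_me, C_you, E_me)
    for C_me, C_you, E_me in product([True, False], repeat=3)
    if iff(C_me, E_me) and not C_you
]
print(countermodels)  # non-empty, so "you will cooperate with me" does not follow
```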
There is a whole literature on this basic issue within analytic philosophy that is, in some sense, aimed at making that kind of logical reduction “go through”.
The efforts grew out of attempts to logically model natural language statements about “propositional attitudes”. Part of the trick is that predicates like “I believe...” or “...implies...” or “It is possible...” generally use a sentence that has been “that quoted” (i.e., quoted using the word “that”).
“I believe that one plus one sums to two.”
“Tyrrell believes that Clippy is not Eliezer.”
“It is possible that Clippy is truly an artificial general intelligence.”
“Jennifer said that that quoting is complicated.”
“That that that that that person referred to, was spoken, explains much.”
Precisely how that-quoting works, and how it logically interacts with the various things that can be predicated of a proposition is, as far as I understand, still an area of active research. One of the primary methods in this area of research is to work out the logical translation of an English test sentence and then see if changes to the logical entailments are predictably explained when various substitutions occur. Sentences where seemingly innocuous substitutions raise trouble are called intensional contexts.
(NOTE: My understanding is that intension is meant here as the “opposite” of extension so that the mechanisms hiding between the “words” and the “extensive meaning” are being relied on in a way that makes the extensional definition of the words not as important as might be naively expected. Terminological confusion is possible because a sentence like “Alice intends that Bob be killed” could be both intensional (not relying solely on extensive meaning) and intentional (about the subject of planning, intent, and/or mindful action).)
Part of the difficulty in this area is that most of the mental machinery appears to be subconscious, and no one (to my knowledge) has found a single intelligible mechanism for the general human faculty. For example, there seem to be at least two different ways for noun phrases to “refer” in ways that can be logically modeled (until counterexamples are found?) that are called “de re reference” or “de dicto reference”… unless the latitudinarians are right :-P
As an added layer of complexity, I’m not sure if these issues are human universals or particular to certain cultures with certain languages. I’ve noticed that in Spanish there is also “that quoting” except they use “que” (literally “what”) instead of “that”, but they have some idioms using “que” whose translations into English don’t involve a “that”. For example “Creo que sí” translates idiomatically to “I think so” but it seems literally to translate as “I believe that yes”.
In older English I’ve seen “what” used in ways that made me think it might sometimes have been used to quote intensional sentences, and then there are weird variations and interactions which just make the problem even more grotty:
“I believe what I believe.”
“I believe that I believe.”
“I believe that which I believe.”
Which isn’t necessarily helpful here, but perhaps it provides some reading material and key words for future efforts to deal with logically modeling complex statements. Generally the solutions I’ve seen for belief involve added terms for language parsing into sentences, so that the person who is said to believe something is modeled as believing a certain sentence while having certain “word-to-actual-object mappings” in operation as something like their grounded (though possibly mistaken) mental rolodex.
Meaning my reasoning skills would be advanced by reading something? So I made an error? Yes, I did. That’s the point.
The comment you are replying to is a reductio ad absurdum. I was not endorsing the claim that it follows that a paperclip truck probably overturned. I was showing that logical equivalence is not the same as causal (“counterfactual”) equivalence.
Meaning my reasoning skills would be advanced by reading something? So I made an error? Yes, I did. That’s the point.
FWIW, I understood that you were presenting an argument to criticize its conclusion. I still think that you haven’t read Pearl (at least not carefully) because, among other things, your putative causal diagram has arrows pointing to exogenous variables.
I still think that you haven’t read Pearl (at least not carefully) because, among other things, your putative causal diagram has arrows pointing to exogenous variables.
I puted no such diagram; rather, you puted a logical statement that you claimed represented the decision theory I was referring to. See also my reply here.
I thought you had because you said
I took this to mean that you were treating P ⇔ (Q ⇔ P) and Q as causal networks, but distinct ones.
You also said
I took this to mean that P was an exogenous variable in a causal network.
I apologize for the misinterpretation.
More generally, are you interested in increasing your intelligence, or do you think that would be a distraction from directly increasing the number of paperclips?
My initial guess for Clippy was Wei Dai, but someone at the SIAI said that they didn’t think Clippy was good enough at decision theory to be Wei Dai. I said that maybe that is just what Clippy wanted us to think and they shrugged.
I don’t follow your point. Your inference follows neither (1) logically, (2) probabilistically, nor (3) according to any plausible method of causal inference, such as Pearl’s. So I don’t understand how it is supposed to illuminate a distinction between causal and logical equivalence.
Nope, it follows logically and probabilistically, but not causally—hence the difference.
Let T be the truck overturning, C be the Clippy making paperclips haphazardly, P being paperclips scattered on ground.
Given: T → P; C → P; P → probably(C); P → probably(T); C
Therefore, P. Therefore, probably T.
But it’s wrong, because what’s actually going on is a causal network of the form:
T → P ← C
P allows probabilistic inference to T and C, but their states become coupled.
In a similar way, P ⇔ (Q ⇔ P) is a lossy description of a decision theory that describes one party’s decision’s causal dependence on another’s. If you treat P ⇔ (Q ⇔ P) as an acausal statement, you can show its equivalence to Q, but it is not the same causal network.
Intuitively, acting based on someone’s disposition toward my disposition is different from deciding someone’s actions. If the parties give strong evidence of each other’s disposition, that has predictable results, in certain situations, but is still different from determining another’s output.
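A toy numerical version of the collider T → P ← C may make the “coupling” concrete. This is only an illustrative sketch: the prior numbers are invented, and P is modeled as a deterministic OR of its two causes.

```python
from itertools import product

# T: truck overturned; C: Clippy used metal haphazardly; independent a priori.
# Paperclips on the ground: P = T or C.  Priors below are made up for illustration.
p_T, p_C = 0.01, 0.05
worlds = {(T, C): (p_T if T else 1 - p_T) * (p_C if C else 1 - p_C)
          for T, C in product([True, False], repeat=2)}

def prob_truck(condition):
    """P(T | condition), where condition is a predicate on (T, C)."""
    kept = {tc: pr for tc, pr in worlds.items() if condition(*tc)}
    return sum(pr for (T, C), pr in kept.items() if T) / sum(kept.values())

print(prob_truck(lambda T, C: T or C))          # seeing paperclips: ~0.17
print(prob_truck(lambda T, C: (T or C) and C))  # paperclips AND Clippy did it: back to 0.01
# Conditioning on the common effect couples T and C: learning C "explains away" T.
```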
Nope, it follows logically and probabilistically, but not causally—hence the difference.
Let T be the truck overturning, C be the Clippy making paperclips haphazardly, P being paperclips scattered on ground.
Given: T → P; C → P; P → probably(C); P → probably(T); C
Therefore, P. Therefore, probably T.
Well, not to nitpick, but you originally wrote something more like P → maybe(C), P → maybe(T). But your conclusion had a “probably” in it, which is why I said that it didn’t follow.
Now, with your amended axioms, your conclusion does follow logically if you treat the arrow “->” as material implication. But it happens that your axioms are not in fact true of the circumstances that you’re imagining. You aren’t imagining that, in all cases, whenever there are paperclips on the ground, a paperclip truck probably overturned. However, if your axioms did apply, then it would be a valid, true, accurate, realistic inference to conclude that, if a Clippy just used up metal haphazardly, then a paperclip truck probably overturned.
But, in reality, and in the situation that you’re imagining, those axioms just don’t hold, at least not if “->” means material implication. However, they are a realistic setup if you treat “->” as an arrow in a causal diagram.
But this raises other questions. In a statement such as P ⇔ (Q ⇔ P), how am I to treat the “⇔”s as the arrows of a causal diagram? Wouldn’t that amount to having two-node causal loops? How do those work? Plus, P is exogenous, right? I’m using the decision theory to decide whether to make P true. In Pearl’s formalism, causal arrows don’t point to exogenous variables. Yet you have arrows pointing to P. How does that work?
I assume you mean,
P, ⇔, (Q ⇔ P)
to be the headers of your truth table.
But even then the truth tables for (P iff (Q iff P)) and (P iff Q) are different—consider the case where ‘you’ will co-operate with me no matter what. If I’m running (P iff Q), I’ll cooperate; if I’m running (P iff (Q iff P)), I’ll defect.
Edit: formatting trouble.
No, I am giving the truth-table for P ⇔ (Q ⇔ P) in a compact form. It’s constructed by first assigning truth-values to the first occurrence of “P” and the first occurrence of “Q”. The second occurrence of “P” gets the same truth-value as the first occurrence in every case. Then you compute the truth-values for the inner-most logical operation, which is the second occurrence of “⇔”. This produces the fourth column of truth values. Finally, you compute the truth-values for the outer-most logical operation, which is the first occurrence of “⇔”.
Hence, the second column of truth-values gives the truth-values of P ⇔ (Q ⇔ P) in all possible cases. In particular, that column matches the third column. Since the third column contains the truth-values assigned to Q, this proves that P ⇔ (Q ⇔ P) and Q are logically equivalent.
ETA: You edited your comment. Those are indeed the correct headers, so my correction above no longer applies.
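For anyone who wants to reproduce the compact table, here is a short sketch (my own code, not part of the original comment) that prints one truth value under each symbol occurrence, in the order just described:

```python
from itertools import product

def iff(a, b):
    return a == b

print("P  ⇔  Q  ⇔  P")
for P, Q in product([True, False], repeat=2):
    inner = iff(Q, P)        # fourth column: the inner biconditional
    outer = iff(P, inner)    # second column: the whole formula P ⇔ (Q ⇔ P)
    print("  ".join("T" if v else "F" for v in (P, outer, Q, inner, P)))
# The second column matches the third (Q), exhibiting the logical equivalence.
```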
But even then the truth tables for (P iff ( Q iff P) ) and ( P iff Q) are different—consider the case where ‘you’ will co-operate with me no matter what. If I’m running ( P iff Q), I’ll cooperate; if I’m running (P iff ( Q iff P) ), I’ll defect.
Yes, the truth-table for P ⇔ (Q ⇔ P) is different from the truth-table for P ⇔ Q. But those aren’t the propositions that I’m saying are equivalent. I’m saying that to assert P ⇔ (Q ⇔ P) is logically equivalent to asserting Q all by itself. In other words, to implement the belief that P ⇔ (Q ⇔ P) is functionally the same as implementing the belief that Q. This means that the belief that Clippy recommends signaling is logically equivalent to an unconditional belief that you will cooperate with me.
One can’t help but suspect that Clippy is trying to sneak into us a belief that it will always cooperate with us ;).
ETA: You edited your comment. Those are indeed the correct headers, so my correction above no longer applies.
Sorry for the confusion. I understand now; the extra space between two of the columns confused me.
However, I suspect we need a stronger logic to represent this properly. If Q always defects, no matter what, “you would cooperate with me if … I … cooperate with you” is false, but is given true in the propositional interpretation.
This is logically equivalent to, and hence carries no more information or persuasive power than
You would cooperate with me.
This may be checked with the following truth-table:
Let P = I would cooperate with you.
Let Q = You would cooperate with me.
Then we have
P ⇔ (Q ⇔ P)
T T T T T
T F F F T
F T T F F
F F F T F