I don’t see how you got from 3 (“S cannot prove that A won’t play chicken”) to 4 (“S cannot prove that A()=B()”). That inference seems way too general; it would apply to any agent of this type. In particular, S certainly can prove that A()=A(). Perhaps 3 is supposed to break some symmetry between A and B? Perhaps I wasn’t clear that I wanted B to play chicken, too. That restores the symmetry, if that’s what the point of 4 was.
I also think that there is a subtle gap in 5 ⇒ 6. If S could prove that all of A’s actions imply B’s defection, then S could prove that B defects, hence contradiction. But A’s decision doesn’t need to examine all of A’s actions: crying can be omitted. I doubt that this actually helps, though.
I can’t solve your problem yet, but I found a cute lemma. Let P be the proposition “A()=a” where a is the first action inspected in step 1 of A’s algorithm.
1. (S⊢¬P)⇒P (by inspection of A)
2. S⊢((S⊢¬P)⇒P) (S can prove 1)
3. (S⊢(S⊢¬P))⇒(S⊢P) (unwrap 2)
4. (S⊢¬P)⇒(S⊢(S⊢¬P)) (property of any nice S)
5. (S⊢¬P)⇒(S⊢P) (combine 3 and 4)
6. (S⊢¬P)⇒(S⊢(P∧¬P)) (rephrasing of 5)
7. (S⊢¬P)⇒¬Con(S) (from 6, by definition of Con)
All the above steps can also be formalized within S, so each player knows that if any player plays chicken with the first inspected action, then S is inconsistent. The proof generalizes to the second inspected action, etc., by looking at the first one that yields a contradiction. But if S is inconsistent, then it will make all players play chicken. So if one player plays chicken, then all of them do, and that fact is provable within S.
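To make the chicken rule concrete, here is a minimal Python sketch (the names `chicken_agent` and `proves` are mine, not from the discussion): the agent inspects actions in order and immediately plays any action x such that S proves A()≠x. Modeling an inconsistent S as a prover that proves every sentence, every player then plays chicken with the first inspected action, as claimed.

```python
# Minimal sketch of the "playing chicken" rule. `proves` stands in
# for proof search in S: it takes a sentence (here just a tag like
# ("A() !=", x)) and returns True or False.

def chicken_agent(actions, proves):
    """Inspect actions in order; if S proves A() != x, play x at once."""
    for x in actions:
        if proves(("A() !=", x)):
            return x
    return None  # no chicken triggered; the real agent would go on
                 # to compare provable utilities, omitted here

# An inconsistent S proves every sentence, so the agent plays
# chicken with the first inspected action:
inconsistent_S = lambda sentence: True
assert chicken_agent(["a", "b"], inconsistent_S) == "a"

# A consistent S that proves nothing about A triggers no chicken:
silent_S = lambda sentence: False
assert chicken_agent(["a", "b"], silent_S) is None
```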
I had a lot of trouble reading this because in my mind ⇒ binds tighter than ⊢. When I figured it out, I was going to suggest that you use spaces to hint at parsing, but you already did. I don’t know what would have helped.
Since we like symmetry, I’m going to change notation from A and B to I and O for “I” and “opponent.” (or maybe “input” and “output”)
We should be careful about the definition of B. Simply saying that it cooperates if I()=O() causes it to blow up against the defectbot. Instead, consider the propositions PC: I()=C ⇒ O()=C and PD: I()=D ⇒ O()=D. We really mean that B should cooperate if S proves P, where P is PC∧PD. What if it doesn’t? There are several potential agents: B1 defects if S doesn’t prove P; B2 defects if S proves ¬P, but breaks down and cries if P is undecidable; B3 breaks down if either PC or PD is undecidable, but defects if they are both decidable and one is false. B3 sounds very similar to A, and so I think that symmetry proves that they cooperate together. If we modified A not to require that every action have a provable utility, but only that one action have a utility provably as big as all others, then I think it would cooperate with B2.
These examples increase my assessment of the possibility that A and B1 cooperate.
(I’m ignoring the stuff about playing chicken, because the comment I’m responding to seems to say I can.)
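The difference between the three B-variants can be sketched in a few lines of Python, assuming proof search returns one of three outcomes for a query; the outcome tags and all function names are my own, not part of the proposal:

```python
# The three B-variants, distinguished by what they do when S fails
# to prove P = PC ∧ PD. `search(q)` abstracts proof search in S and
# returns 'proved', 'refuted', or 'undecidable' for a query q.

def B1(search):
    # defects whenever S doesn't prove P
    return "C" if search("P") == "proved" else "D"

def B2(search):
    # defects only on a disproof of P; cries when P is undecidable
    out = search("P")
    if out == "proved":
        return "C"
    return "D" if out == "refuted" else "cry"

def B3(search):
    # cries if PC or PD is undecidable; defects if both are decidable
    # and one is false; cooperates when both are proved
    pc, pd = search("PC"), search("PD")
    if "undecidable" in (pc, pd):
        return "cry"
    return "C" if pc == pd == "proved" else "D"
```

For instance, against a search that leaves everything undecidable, B1 defects while B2 and B3 break down and cry.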
I think your conclusions can be right, but the proofs are vague. Can you debug your reasoning like you debugged mine above?
Did you manage to make any progress?
I tried this in the case where the output of A provably lies in the set {a,b}. I only managed to prove
(S⊢¬P)⇒¬Con(S+Con(S))
where P is the proposition “A()=b” and b is the second inspected action. But this still implies the conclusion.
Thanks! I think you found a real hole and the conclusion is also wrong. Or at least I don’t see how
(S⊢¬P)⇒¬Con(S+Con(S))
implies the conclusion.
The conclusions that I think I can draw are
¬(S⊢A()≠a)∧(S⊢A()≠b) ⇒ Con(S)∧¬Con(S+Con(S)) ⇒ A()=b
So if one player plays chicken on the second inspected action, then all of them do.
I’m still not getting it. Can you explain the proof of the following:
Con(S)∧¬Con(S+Con(S)) ⇒ A()=b
1. ¬Con(S+Con(S)) ⇒ S⊢¬Con(S)
2. ⇒ S⊢Prv(A()≠a) (S proves that an inconsistent theory proves anything)
3. ⇒ S⊢A()=a (because A plays chicken)
4. ⇒ S⊢A()≠b
5. Con(S)∧¬Con(S+Con(S)) ⇒ Con(S)∧(S⊢A()≠b) (follows directly from 4)
6. ⇒ (S⊬A()=b)∧(S⊢A()≠b) (a consistent theory can’t prove a proposition and its negation)
7. ⇒ (S⊬A()≠a)∧(S⊢A()≠b) (here I’m assuming that S⊢(A()=a∨A()=b); I don’t know what to do if A can choose between more than two alternatives)
8. ⇒ A()=b (A plays chicken on the second inspected action)
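The final step can be sanity-checked with a toy model of the two-action chicken agent: a prover satisfying the hypotheses of step 7 (S does not prove A()≠a but does prove A()≠b) makes the chicken rule fire on the second inspected action. This only illustrates the chicken rule itself, not the Con(S+Con(S)) reasoning; all names here are mine.

```python
# Toy check of step 8: if S does not prove A() != a but does prove
# A() != b, the chicken rule skips a and fires on b, so A plays b.

def chicken_agent(actions, proves):
    """Inspect actions in order; if S proves A() != x, play x at once."""
    for x in actions:
        if proves(("A() !=", x)):
            return x
    return None

# Prover matching the hypotheses of step 7:
S = lambda sentence: sentence == ("A() !=", "b")
assert chicken_agent(["a", "b"], S) == "b"
```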
Cool.
Nice catches! You’re right on both points. I have to go to sleep now, will figure this out tomorrow.