So there’s one last fix that occurred to me overnight, but I can’t see whether it could possibly work.
The impasse is that we need a version of FairBot which acts as usual (given enough deductive capacity) AND is able to deduce that it defects mutually against DefectBot. The current problem is that even though it can prove that DefectBot defects, this isn’t enough to prove that FairBot won’t find a proof of DefectBot’s cooperation anyway. (FairBot can’t deduce that FairBot’s deduction procedure is consistent, unless it’s inconsistent!)
A failed fix is a FairBot that simultaneously looks for proofs of the opponent’s cooperation and defection, and acts symmetrically as soon as it finds one (and defects if it doesn’t). The problem is that this algorithm may as easily find a proof of mutual defection against itself as a proof of mutual cooperation; it depends on the proof ordering.
So the last-ditch fix is for FairBot to search for both proofs simultaneously, but somehow search more effectively in the cooperative direction, so that we can guarantee it finds a cooperative proof against itself rather than a defective one. This could be done either by giving it exponentially more search time in the cooperative direction than in the defective direction, or by preparing it with a Löbian constructive approach in the cooperative direction.
The current problem is that even though it can prove that DefectBot defects
How can it do that in the general case?
The problem is that this algorithm may as easily find a proof of mutual defection against itself as a proof of mutual cooperation; it depends on the proof ordering.
Maybe I’m missing something, but wouldn’t that imply that the formal system is inconsistent, and hence useless due to the principle of explosion?
Maybe I’m missing something, but wouldn’t that imply that the formal system is inconsistent, and hence useless due to the principle of explosion?
If program A which uses a specific proof ordering can find a proof of some theorem about program A, and program B which uses a different proof ordering can find a proof of the opposite theorem about program B, that doesn’t imply inconsistency.
He’s saying that program A can find a proof that it cooperates with program A, but if we slightly change the proof ordering in program A and obtain program B, then program B can find a proof that it defects against program B. I still don’t see the inconsistency.
Ok, so the proof ordering is considered part of the program, I assumed it was an external input to be universally quantified. Thanks for the clarification.
Let’s imagine for the moment that we have an increasing function f from the naturals to the naturals, and we instruct our FairBot variant to spend f(N) steps looking only for cooperative proofs (i.e. reacting only if it finds a proof that Opp(FairBot)=C) before spending an Nth step looking for defective ones (proofs of Opp(FairBot)=D). When paired against DefectBot, the simple proof that DefectBot defects will come up at the Kth step of looking for defective proofs, and thus at the (f(K)+K)th step of FairBot’s algorithm.
But then, if FairBot needs the lemma that FairBot defects against DefectBot, this can be proved in g(f(K)+K) steps for some function g. The important part here is that this limit doesn’t blow up as FairBot’s proof limit increases! (It’ll still increase a bit, simply because FairBot needs a slightly longer representation if it’s mentioning a sufficiently large proof limit, but this sort of increase is quite manageable.)
So the real question is, is there a function f such that we can be sure this FairBot variant finds a Löbian proof of cooperation with itself in fewer than f(N) steps, before it finds a Löbian proof of defection against itself in N steps?
You could make f(1) large enough for Loebian cooperation to happen immediately. Then you could make f(2) large enough to simulate what happened in the first f(1)+1 steps, and so on. I still don’t completely understand how you’re going to use this thing, though.
Lemma 1: There’s a proof that M’(DefectBot)=D, and that there does not exist a shorter proof of M’(DefectBot)=C.
Proof: There’s a short proof that DefectBot(DefectBot)=D, and a proof of length (waiting time + short) that FB’(DefectBot)=DefectBot(FB’)=D. (It needs to wait until FB’ stops looking for only cooperative proofs.) Clearly there are no shorter proofs of contrary statements.
So there’s a proof that M’ skips to its default defection (since there’s nothing strictly better), and this proof has length less than (g(short) + g(wait + short) + short), where g is the function that bounds how long it takes to verify that something is the shortest proof of a certain type, in terms of the proof length. (I think g is just bounded by an exponential, but it doesn’t matter).
Thus, of course, Lemma 1 has a proof of length at most g(g(short) + g(wait + short) + short).
Lemma 2: There’s a proof that M’(FB’)=FB’(M’)=C, and that there does not exist a shorter proof of any alternative outcome.
Proof: Similar to Lemma 1. The initial proofs are a bit longer because we need the Löbian proof of length less than (waiting time) that FB’(FB’)=C, instead of the short proof that DefectBot(DefectBot)=D.
And so there’s a proof of length (g(wait+short) + g(wait) + short) that M’ goes to the sanity-check trying to prove that FB’(M’)=FB’(FB’). Now a proof of length L that FB’(M’)=C would imply a proof of length L + (g(wait+short) + g(wait) + 2*short) that M’(FB’)=C...
Oh, and here’s the impasse again, because the existence of a proof that M’(FB’)=C is no longer enough to prove that FB’(M’)=C, you also need that there’s no shorter proof that M’(FB’)=D. Unless there’s a clever way to finesse that (within something provable by the same system), it looks like that’s where this last-ditch fix fails.
So this is setting off all sorts of aesthetic flags in my head, but it appears to work:
Let FairBotPrime (FB’) be the agent that does this: first tries to prove Opp(FB’)=C for a long time (long enough for Löb to work), then simply tries to prove either Opp(FB’)=C or Opp(FB’)=D for the rest of the time. If it ever succeeds, it immediately halts and returns the same move as it just deduced. If it doesn’t deduce anything in time, then it defects by default.
Let MasqueradePrime (M’) be the agent that first tries on the DefectBot and FB’ masks: it looks through all proofs for one of the form “DefectBot(Opp)=X and Opp(DefectBot)=Y” and progresses to the next stage once it finds one (or once it runs out of allotted time), then does the same for the FB’ mask. Then, if either payout thus obtained exceeds the default (D,D) payout, it sanity-checks whichever mask gets the better result (let’s say it defaults to DefectBot if they tie): it tries to prove “Opp(M’)=Opp(chosen mask)”, and if so then it outputs the (pre-calculated) move that the chosen mask makes against the opponent. If something along this path fails, then it just outputs its default move (D).
I claim that M’ correctly defects against DefectBot and CooperateBot, and cooperates mutually with FB’ and M’. The key here is that at no point does it need to deduce that a proof search failed; it only needs to show that a certain proof is the first one that an agent finds. I’ll reply again with some lemmas.
I had a similar idea some months ago, about making the agent spend exponentially more time on proofs that imply higher utility values. Unfortunately such an agent would spend the most time trying to achieve (D,C), because that’s the highest utility outcome. Or maybe I misunderstand your idea...
So there’s one last fix that occurred to me overnight, but I can’t see whether it could possibly work.
The impasse is that we need a version of FairBot which acts as usual (given enough deductive capacity) AND is able to deduce that it defects mutually against DefectBot. The current problem is that even though it can prove that DefectBot defects, this isn’t enough to prove that FairBot won’t find a proof of DefectBot’s cooperation anyway. (FairBot can’t deduce that FairBot’s deduction procedure is consistent, unless it’s inconsistent!)
A failed fix is a FairBot that simultaneously looks for proofs of the opponent’s cooperation and defection, and acts symmetrically as soon as it finds one (and defects if it doesn’t). The problem is that this algorithm may as easily find a proof of mutual defection against itself as a proof of mutual cooperation; it depends on the proof ordering.
So the last-ditch fix is for FairBot to search for both proofs simultaneously, but somehow search more effectively in the cooperative direction, so that we can guarantee it finds a cooperative proof against itself rather than a defective one. This could be done either by giving it exponentially more search time in the cooperative direction than in the defective direction, or by preparing it with a Löbian constructive approach in the cooperative direction.
Do either of these seem feasible?
How can it do that in the general case?
Maybe I’m missing something, but wouldn’t that imply that the formal system is inconsistent, and hence useless due to the principle of explosion?
If program A which uses a specific proof ordering can find a proof of some theorem about program A, and program B which uses a different proof ordering can find a proof of the opposite theorem about program B, that doesn’t imply inconsistency.
Hence he’s assuming that A = B, if I understand correctly.
He’s saying that program A can find a proof that it cooperates with program A, but if we slightly change the proof ordering in program A and obtain program B, then program B can find a proof that it defects against program B. I still don’t see the inconsistency.
Yes, this is what I meant. Thanks!
Ok, so the proof ordering is considered part of the program, I assumed it was an external input to be universally quantified. Thanks for the clarification.
(Sorry, my previous reply was off the mark.)
If the agent spends more time trying to prove cooperation, doesn’t it become harder to prove that the agent defects against DefectBot?
Let’s imagine for the moment that we have an increasing function f from the naturals to the naturals, and we instruct our FairBot variant to spend f(N) steps looking only for cooperative proofs (i.e. reacting only if it finds a proof that Opp(FairBot)=C) before spending an Nth step looking for defective ones (proofs of Opp(FairBot)=D). When paired against DefectBot, the simple proof that DefectBot defects will come up at the Kth step of looking for defective proofs, and thus at the (f(K)+K)th step of FairBot’s algorithm.
But then, if FairBot needs the lemma that FairBot defects against DefectBot, this can be proved in g(f(K)+K) steps for some function g. The important part here is that this limit doesn’t blow up as FairBot’s proof limit increases! (It’ll still increase a bit, simply because FairBot needs a slightly longer representation if it’s mentioning a sufficiently large proof limit, but this sort of increase is quite manageable.)
So the real question is, is there a function f such that we can be sure this FairBot variant finds a Löbian proof of cooperation with itself in fewer than f(N) steps, before it finds a Löbian proof of defection against itself in N steps?
You could make f(1) large enough for Loebian cooperation to happen immediately. Then you could make f(2) large enough to simulate what happened in the first f(1)+1 steps, and so on. I still don’t completely understand how you’re going to use this thing, though.
Lemma 1: There’s a proof that M’(DefectBot)=D, and that there does not exist a shorter proof of M’(DefectBot)=C.
Proof: There’s a short proof that DefectBot(DefectBot)=D, and a proof of length (waiting time + short) that FB’(DefectBot)=DefectBot(FB’)=D. (It needs to wait until FB’ stops looking for only cooperative proofs.) Clearly there are no shorter proofs of contrary statements.
So there’s a proof that M’ skips to its default defection (since there’s nothing strictly better), and this proof has length less than (g(short) + g(wait + short) + short), where g is the function that bounds how long it takes to verify that something is the shortest proof of a certain type, in terms of the proof length. (I think g is just bounded by an exponential, but it doesn’t matter).
Thus, of course, Lemma 1 has a proof of length at most g(g(short) + g(wait + short) + short).
Lemma 2: There’s a proof that M’(FB’)=FB’(M’)=C, and that there does not exist a shorter proof of any alternative outcome.
Proof: Similar to Lemma 1. The initial proofs are a bit longer because we need the Löbian proof of length less than (waiting time) that FB’(FB’)=C, instead of the short proof that DefectBot(DefectBot)=D.
And so there’s a proof of length (g(wait+short) + g(wait) + short) that M’ goes to the sanity-check trying to prove that FB’(M’)=FB’(FB’). Now a proof of length L that FB’(M’)=C would imply a proof of length L + (g(wait+short) + g(wait) + 2*short) that M’(FB’)=C...
Oh, and here’s the impasse again, because the existence of a proof that M’(FB’)=C is no longer enough to prove that FB’(M’)=C, you also need that there’s no shorter proof that M’(FB’)=D. Unless there’s a clever way to finesse that (within something provable by the same system), it looks like that’s where this last-ditch fix fails.
Drat.
So this is setting off all sorts of aesthetic flags in my head, but it appears to work:
Let FairBotPrime (FB’) be the agent that does this: first tries to prove Opp(FB’)=C for a long time (long enough for Löb to work), then simply tries to prove either Opp(FB’)=C or Opp(FB’)=D for the rest of the time. If it ever succeeds, it immediately halts and returns the same move as it just deduced. If it doesn’t deduce anything in time, then it defects by default.
Let MasqueradePrime (M’) be the agent that first tries on the DefectBot and FB’ masks: it looks through all proofs for one of the form “DefectBot(Opp)=X and Opp(DefectBot)=Y” and progresses to the next stage once it finds one (or once it runs out of allotted time), then does the same for the FB’ mask. Then, if either payout thus obtained exceeds the default (D,D) payout, it sanity-checks whichever mask gets the better result (let’s say it defaults to DefectBot if they tie): it tries to prove “Opp(M’)=Opp(chosen mask)”, and if so then it outputs the (pre-calculated) move that the chosen mask makes against the opponent. If something along this path fails, then it just outputs its default move (D).
I claim that M’ correctly defects against DefectBot and CooperateBot, and cooperates mutually with FB’ and M’. The key here is that at no point does it need to deduce that a proof search failed; it only needs to show that a certain proof is the first one that an agent finds. I’ll reply again with some lemmas.
I had a similar idea some months ago, about making the agent spend exponentially more time on proofs that imply higher utility values. Unfortunately such an agent would spend the most time trying to achieve (D,C), because that’s the highest utility outcome. Or maybe I misunderstand your idea...