OK, firstly I found an error in what I wrote, namely condition (3) was missing a piece which I erroneously thought was redundant. Fixed it in both posts.
In that case, I’m guessing that such an agent is vulnerable to the procrastination paradox.
It surely is. However, the problem goes away once we restrict to goals G such that given a time sequence {a_m} that doesn’t satisfy G, there is an initial segment {a_m}_{m<N} s.t. any sequence beginning with this initial segment also doesn’t satisfy G. For example, assume the “motor action space” consists of natural numbers and the goal is as follows: either a_m = m^2+1 for all m or a_m = 0 for the first m violating the previous equation. 0 serves us as the “fallback action”. The agent is guaranteed to achieve any goal of this form.
I’d love to see a walkthrough (first in the previous system, then in the ordinal system) of how the proof goes through with a goal of the form “there exists a time at which the button becomes pressed” where the parent agent is attempting to construct a perfect copy of itself and where the fallback action 0 is “press the button”. Could you provide one? I’m still a bit unclear about how the system works, and I think the walkthrough would help.
I’m guessing that the unclear part is how are conditions (3) and (4) satisfied. I’ll use the goal I described in the previous paragraph to avoid the procrastination paradox (otherwise the tiling still happens but the goal isn’t guaranteed to be really achieved). Let’s start with the “finite” system.
The sequence Am is recursively defined within T_κ by A_m() = a_m = (a_m, A_{m+1}). Here a_m are the motor actions which are used to define G.
Condition (4) is T_κ | (A0() = b_0 and B1() = 0) → G. Because of the way fallback works with this goal, we need to introduce the further assumption that b_0 = 0^2+1 = 1 where b_0 = (b_0, B0). The T_κ-proof of (4) now goes as follows. A0() = b_0 implies A1 = B1. B1() = 0 implies A1() = 0 implies a_1 = 0. A0() = b_0 also implies a_0 = b_0 = 1. a_0 = 1 and a_1 = 0 imply G by definition of G.
Condition (3) is T_κ | For all b_1: (B1() = b_1 → (b_1 = 0 or Exists n: (κ > n and @_{T_κ} “(κ+1 > n and B1()[κ\κ+1] = b_1) → G[κ\κ+1]”))). This condition is T_κ-proved by inspecting B1′s code. B1 looks for T_κ-proofs of theorems of the form “κ > n-1 and B1()[κ\κ+1] = b_1 → G[κ\κ+1]”. It is licensed to act on such a proof only when the oracle affirms the stronger premise κ > n. Note that G[κ\κ+1] looks the same as G except that the oracle used in evaluating Am() is “twisted” (in a way that doesn’t affect the secret real implementation of the oracle). B1 performs fallback action 0 when it fails to find any proof as above. In practice, if a_0=(0^2+1=1,B1) then B1 is able to prove it (for example by running A0; more precisely since its goal is twisted it has to run a twisted A0 but it doesn’t change the result). It can then go on to prove that a suitable motor action and a suitable successor agent will lead to the goal (since B1 and G are twisted in the same way within the theorem). If B1 is an “L-zombie” i.e. it wasn’t really constructed by A0 then it a priori might fail to find a suitable proof in which case it will fallback.
I’ll stop at this point. If this clarification was insufficient, please point me in the direction of the problem. If this clarification was sufficient for the “finite” system but the analogical argument for the ordinal system is unclear, tell me and I will produce it.
OK, firstly I found an error in what I wrote, namely condition (3) was missing a piece which I erroneously thought was redundant. Fixed it in both posts.
It surely is. However, the problem goes away once we restrict to goals G such that given a time sequence {a_m} that doesn’t satisfy G, there is an initial segment {a_m}_{m<N} s.t. any sequence beginning with this initial segment also doesn’t satisfy G. For example, assume the “motor action space” consists of natural numbers and the goal is as follows: either a_m = m^2+1 for all m or a_m = 0 for the first m violating the previous equation. 0 serves us as the “fallback action”. The agent is guaranteed to achieve any goal of this form.
I’m guessing that the unclear part is how are conditions (3) and (4) satisfied. I’ll use the goal I described in the previous paragraph to avoid the procrastination paradox (otherwise the tiling still happens but the goal isn’t guaranteed to be really achieved). Let’s start with the “finite” system.
The sequence Am is recursively defined within T_κ by A_m() = a_m = (a_m, A_{m+1}). Here a_m are the motor actions which are used to define G.
Condition (4) is T_κ | (A0() = b_0 and B1() = 0) → G. Because of the way fallback works with this goal, we need to introduce the further assumption that b_0 = 0^2+1 = 1 where b_0 = (b_0, B0). The T_κ-proof of (4) now goes as follows. A0() = b_0 implies A1 = B1. B1() = 0 implies A1() = 0 implies a_1 = 0. A0() = b_0 also implies a_0 = b_0 = 1. a_0 = 1 and a_1 = 0 imply G by definition of G.
Condition (3) is T_κ | For all b_1: (B1() = b_1 → (b_1 = 0 or Exists n: (κ > n and @_{T_κ} “(κ+1 > n and B1()[κ\κ+1] = b_1) → G[κ\κ+1]”))). This condition is T_κ-proved by inspecting B1′s code. B1 looks for T_κ-proofs of theorems of the form “κ > n-1 and B1()[κ\κ+1] = b_1 → G[κ\κ+1]”. It is licensed to act on such a proof only when the oracle affirms the stronger premise κ > n. Note that G[κ\κ+1] looks the same as G except that the oracle used in evaluating Am() is “twisted” (in a way that doesn’t affect the secret real implementation of the oracle). B1 performs fallback action 0 when it fails to find any proof as above. In practice, if a_0=(0^2+1=1,B1) then B1 is able to prove it (for example by running A0; more precisely since its goal is twisted it has to run a twisted A0 but it doesn’t change the result). It can then go on to prove that a suitable motor action and a suitable successor agent will lead to the goal (since B1 and G are twisted in the same way within the theorem). If B1 is an “L-zombie” i.e. it wasn’t really constructed by A0 then it a priori might fail to find a suitable proof in which case it will fallback.
I’ll stop at this point. If this clarification was insufficient, please point me in the direction of the problem. If this clarification was sufficient for the “finite” system but the analogical argument for the ordinal system is unclear, tell me and I will produce it.