1) Yes, the solution should be an agent program. It can’t be something as simple as “return 1”, because when I talk about solving the LPP, there’s an implicit desire to have a single agent that solves all problems similar enough to the LPP, for example the version where the agent’s actions 1 and 2 are switched, or where the agent’s source code has some extra whitespace and comments compared to its own quined representation, etc.
2) We imagine the world to be a program with no arguments that returns a utility value, and the agent to be a subprogram within the world program. Even though the return value of an argumentless program is just a constant, the agent can still try to “maximize that constant”, if the agent is ignorant about it in just the right way. For example, if the world program calls the agent program and then returns 0 or 1 depending on whether the agent’s return value was even or odd, and the agent can prove a theorem to that effect by looking at the world program’s source code, then it makes sense for the agent to return an odd value.
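To make the parity example concrete, here is a minimal runnable sketch (my own toy code, not anything canonical): the agent is hardcoded to act on the theorem it could have proven by reading the world's source.

```python
def agent():
    # Suppose the agent has proven the theorem: "world() returns 1
    # iff my return value is odd." Acting on it, return an odd value.
    return 3

def world():
    # Argumentless world program: calls the agent, then returns
    # 1 if the agent's return value was odd, 0 if it was even.
    return agent() % 2

print(world())  # → 1
```

The world's return value is of course a constant, but from the agent's perspective it was an unknown that its own choice of return value provably controls.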
Newcomb’s Problem can be formalized as a world program that makes two calls to the agent (or maybe one call to the agent and another call to something provably equivalent). The first call’s return value is used to set the contents of the boxes, and the second represents the agent’s actual decision. If a smart enough agent receives the world’s source code as an argument (which contains possibly mangled copies of the agent’s own source code), and the agent knows its own source code by quining, then the agent can prove a theorem saying that one-boxing logically implies higher utility than two-boxing. That setting is explored in a little more detail here.
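A toy sketch of that world program (payoffs are the usual illustrative ones, and I'm passing the agent as a function argument rather than splicing in its source, so the "prediction" is trivially accurate here):

```python
ONE_BOX, TWO_BOX = 1, 2

def world(agent):
    # First call: the predictor runs the agent to decide whether
    # to put $1,000,000 in the opaque box.
    prediction = agent()
    opaque = 1_000_000 if prediction == ONE_BOX else 0
    # Second call: the agent's actual decision.
    choice = agent()
    # Two-boxing also takes the transparent $1,000.
    return opaque if choice == ONE_BOX else opaque + 1_000

print(world(lambda: ONE_BOX))  # → 1000000
print(world(lambda: TWO_BOX))  # → 1000
```

In the real formalization both calls receive (mangled) source code rather than a Python closure, and the agent must prove, rather than observe, that the two calls return the same thing.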
Before you ask: no, we don’t know any rigorous definition of what it means to “maximize” the return value of an argumentless program in general. We’re still fumbling with isolated cases, hoping to find more understanding. I’m only marginally less confused than you about the whole field.
3) You can think about an agent as a program that receives the world’s source code as an argument, so that one agent can solve many possible world programs. I usually talk about agents as if they were argumentless functions that had access to the world’s source code via quining, but that’s just to avoid cluttering up the proofs. The results are the same either way.
4) Usually you can assume that S is just Peano arithmetic. You can represent programs by their Gödel numbers, and write a PA predicate saying “program X returns integer Y”. You can also represent statements and proofs in PA by their Gödel numbers, and write a PA predicate saying “proof X is a valid proof of statement Y”. You can implement both these predicates in your favorite programming language, as functions that accept two integers and return a boolean. You can have statements referring to these predicates by their Gödel numbers. The diagonal lemma gives you a generalized way to make statements refer to themselves, and quining allows you to have programs that refer to their own source code. You can have proofs that talk about programs, programs that enumerate and check proofs, and generally go wild.
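The quining trick mentioned above is easy to demonstrate directly. Here is a standard two-line Python quine, a program whose output is exactly its own source code:

```python
# A minimal Python quine: %r substitutes the repr of the string into
# itself, and %% escapes to a literal %, reproducing the source exactly.
src = 'src = %r\nprint(src %% src)'
print(src % src)
```

The diagonal lemma is the proof-theoretic analogue of this construction: the same self-substitution idea, applied to formulas instead of programs.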
For example, you can write a program P that enumerates all possible proofs, searching for a valid proof that P itself returns 1, and returns 1 if such a proof is found. To see that P will in fact return 1 and not loop forever, note that by inspecting P’s source code, PA can prove “if there exists a proof that P returns 1, then P returns 1”; Löb’s theorem then gives an actual proof that P returns 1, which P’s search eventually finds. That setting is explored in a little more detail here.
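Structurally, P looks like the following sketch. The proof checker here is a made-up stand-in (a real one would decode Gödel numbers and verify each inference step, and the search would be astronomically slow); the point is only the control flow of the search.

```python
def p(is_valid_proof, statement='p() returns 1'):
    # Enumerate candidate proofs, encoded as natural numbers, searching
    # for a valid proof of the statement "p() returns 1".
    n = 0
    while True:
        if is_valid_proof(n, statement):
            return 1
        n += 1

# Toy checker standing in for a real PA proof verifier (purely an
# assumption for illustration): pretend proof number 42 proves it.
print(p(lambda n, s: n == 42))  # → 1
```

With a genuine PA proof checker, Löb’s theorem is what guarantees the `while True` loop terminates rather than searching forever.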
Please let me know if the above makes sense to you!
Thank you. Your comment resolved some of my confusion. While I didn’t understand it entirely, I am happy to have accrued a long list of relevant background reading.