To check the supplied proofs, A needs to be able to recognize statements of the form “A()==a implies U()==u, and A()!=a implies U()<=u”, which seems to require knowing the source code of A. Or do you propose trying to check proofs of some different form instead, that would only mention U? What would such statements look like?
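For concreteness, here is a minimal caricature of the proof-search loop I have in mind; the stub prover, the payoff numbers, and every name in it are invented for illustration, not the actual algorithm. For simplicity it only checks the first conjunct, "A()==a implies U()==u"; the second conjunct would be checked the same way.

```python
# Toy sketch of a proof-searching agent.  PROVABLE and try_prove() are stubs;
# a real prover would be checking theorems about the world program U() whose
# statements mention A()'s own source code.

ACTIONS = ["one-box", "two-box"]

# Pretend these are the implications the prover manages to verify:
PROVABLE = {
    ("one-box", 1_000_000): True,   # "A()=='one-box' implies U()==1000000"
    ("two-box", 1_000): True,       # "A()=='two-box' implies U()==1000"
}


def try_prove(action, utility):
    """Stub for a checker of statements of the quoted form."""
    return PROVABLE.get((action, utility), False)


def A():
    """Return the action with the highest utility backed by a found proof."""
    best_action, best_utility = None, float("-inf")
    for a in ACTIONS:
        for u in (1_001_000, 1_000_000, 1_000, 0):
            if try_prove(a, u) and u > best_utility:
                best_action, best_utility = a, u
    return best_action


print(A())  # -> 'one-box' in this toy world
```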
Not when the statements are already given with A() in them, as in Newcomb's problem. Newcomb's problem doesn't say: here is Omega, and it has a brain sim with such and such neurons connected so and so with such and such weights, and it exposes this brain sim to this entire string, and if the brain sim tells it so and so, it puts stuff into one box, otherwise it puts stuff into the other box. No, it already says Omega predicts you, using the symbol "you" which already means A(). Nobody has ever presented Newcomb to you with an implicit, rather than explicit, self-reference.
One could, of course, insert a quine into the algorithm to allow it to recognize itself; such an algorithm could start by replacing the 'brain sim with such and such neurons connected so and so' with the symbol A() in the input statements (or perhaps by running a prover there to try to replace any obscured instances of A()), and then solve the problem without further use of the internals of A().
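A rough sketch of that substitution step, with `inspect.getsource` standing in for a proper quine; the world description and the `solve` stub are invented for illustration:

```python
import inspect


def agent():
    """Toy agent that replaces literal copies of its own source in the
    problem statement with the opaque symbol A() before reasoning."""
    # Stand-in for a proper quine: recover our own source code.
    my_source = inspect.getsource(agent)

    # An invented world description that embeds the agent's source verbatim
    # (the "brain sim with such and such neurons" above).
    world_description = (
        "Omega runs the following program: " + my_source
        + " and fills box B iff that program returns 'one-box'."
    )

    # Fully opaque substitution: every literal copy of our source becomes A().
    # A stronger version would also run a prover here to catch obscured copies.
    normalized = world_description.replace(my_source, "A()")

    # From here on the agent reasons about statements that mention only A()
    # and U(), without using any knowledge of A()'s internals.
    return solve(normalized)


def solve(statement):
    """Placeholder decision procedure over the normalized statement."""
    return "one-box" if "A()" in statement else "two-box"


if __name__ == "__main__":
    print(agent())  # -> 'one-box'
```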
More philosophically, this may be what our so-called reflection is about: fully opaque black-box substitution. One must not read too much into it, though, because non-opaque substitution not only 'pollutes' the theorem prover with contradictions any time you think "okay, suppose I give out answer a", but is also terribly impractical.
One could also fix it by assuming that your output is slightly uncertain (an epsilon probability of changing your mind), which kills the wrong proofs (and also stops the prediction nonsense and other entirely unphysical ideas, with the side effect that the agent might not act ideally in an ideal world).
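To spell out the mechanism (my gloss of "kills the wrong proofs", not something stated above): the spurious proofs come from vacuous implications. If the prover could establish $A() \neq a$, then

$$A() = a \;\rightarrow\; U() = u$$

would be a theorem for every $u$ whatsoever. Guaranteeing $P(A() = a) \geq \varepsilon > 0$ for every action $a$ means $A() \neq a$ is never certain, so these vacuous implications are no longer available and the agent has to reason about what actually happens given $A() = a$.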
The usual problem with adding randomness is that all copies of the agent must have perfectly correlated sources of randomness (otherwise you end up with something like CDT), and that amounts to explicitly pointing out all copies of the agent.
Well, too bad for all the theories that can't handle slight randomness, because in our universe you can't implement an agent that does not have slight uncorrelated randomness in its output; that's just how our universe is.
I don’t understand your comment. Our proof-theoretic algorithms can handle worlds with randomness just fine, because the logical fact of the algorithm’s return value will be more or less correlated with many physical facts. The proof-theoretic algorithms don’t require slight randomness to work, and my grandparent comment gave a reason why it seems hard to write a version that relies completely on slight randomness to kill spurious proofs, as you suggested. Of course if you do find a way, it will be cool.
Well, your grandparent comment spoke of a problem with adding randomness, rather than of a lack of necessity to add it. Maybe I simply misunderstood it.
Note, by the way, that the self-description will have to include randomness in order to be an accurate description of an imperfect agent.
Let me try to explain. Assume you have an agent that’s very similar to our proof-theoretic ones, but also has a small chance of returning a random action. Moreover, the agent’s self-description includes a mention of the random variable V that makes the agent return a random action.
The first problem is that statements like “A()==a” are no longer logical statements. That seems easy to fix by making the agent look at conditional expected values instead of logical implications.
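Concretely, that fix presumably amounts to replacing the quoted criterion with something like

$$\text{choose } \arg\max_{a} \; \mathbb{E}\big[\,U() \mid A() = a\,\big],$$

which is well defined once the $\varepsilon$-noise guarantees $P(A() = a) > 0$ for every action $a$.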
The second and more serious problem is this: how many independent copies of V does the world contain? Imagine the agent is faced with Newcomb's Problem. If the world program contains only one copy of V that's referenced by both instances of the agent, that amounts to abusing the problem formulation to tell the agent "your copies are located here and here". And if the two copies of the agent use uncorrelated instances of V, then looking at conditional expected values based on V buys you nothing: figuring out the best action reduces to the same logical question that the proof-theoretic algorithms are faced with.
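Here is a small Monte Carlo toy of that second problem (the payoffs, the value of epsilon, and the two noise models are arbitrary choices of mine). With a shared V the prediction tracks the realized action, so conditioning on A() is informative; with independent copies of V the two conditional values differ only by the flat $1000, and the real work is still figuring out which policy the underlying deterministic code outputs.

```python
import random

EPS = 0.05            # chance that the noise source overrides the policy
N = 200_000           # Monte Carlo samples
ACTIONS = ["one-box", "two-box"]


def noisy_action(policy, v):
    """With probability EPS, the noise source v picks a random action."""
    if v.random() < EPS:
        return v.choice(ACTIONS)
    return policy


def payoff(action, prediction):
    """Newcomb payoffs: box B is filled iff Omega predicts one-boxing."""
    box_b = 1_000_000 if prediction == "one-box" else 0
    return box_b + (1_000 if action == "two-box" else 0)


def conditional_utilities(policy, shared_noise):
    """Estimate E[U() | A()==a] for each a, under one of two noise models."""
    totals = {a: 0.0 for a in ACTIONS}
    counts = {a: 0 for a in ACTIONS}
    for _ in range(N):
        seed = random.random()
        v_agent = random.Random(seed)
        v_pred = random.Random(seed) if shared_noise else random.Random()
        prediction = noisy_action(policy, v_pred)   # Omega's simulation of A()
        action = noisy_action(policy, v_agent)      # the agent's actual output
        totals[action] += payoff(action, prediction)
        counts[action] += 1
    return {a: round(totals[a] / counts[a]) for a in ACTIONS if counts[a]}


if __name__ == "__main__":
    for shared in (True, False):
        print("shared V:" if shared else "independent copies of V:")
        for policy in ACTIONS:
            print(f"  policy={policy!r:>10}  E[U()|A()==a] = "
                  f"{conditional_utilities(policy, shared)}")
```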
Your comment seems to be an instance of arguing about assumptions. Does your alternative approach lead toward interesting new math? If not, then exploring it doesn’t seem to make sense, even if it’s true, because we already know all there is to know about it.