A simple approach to 5-and-10
To recap, A is trying to deduce enough about U to maximize it:
A := Find some f with a proof of “U = f(A)”, then return argmax f.
If U is just A (the 5-and-10 case), A can fail by proving f = {(5,5),(10,0)}: if it could prove this, it would return argmax f = 5, and the statement would then be true, since only the value at 5 gets checked. So by Löb’s theorem, it can indeed prove it.
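To spell out the Löbian step a little more formally (assuming this spurious proof is the one A happens to find): write φ for the sentence “U = f(A)” with f = {(5,5),(10,0)}. The proof system can verify that if φ were provable, A would return argmax f = 5 and U would then equal f(5) = 5, so

\[
\vdash \Box\varphi \rightarrow \varphi, \qquad\text{and Löb’s theorem turns this into}\qquad \vdash \varphi .
\]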
I’ve thrown this problem at some people at my university, and a fellow student’s idea led to:
A’ := Find all f with a proof of “U = f(A’)”, then return the argmax of their pointwise maximum.
The true f is among the found f, so their pointwise maximum is at least the true payoff everywhere. Fooling ourselves into a suboptimal action would require believing it to be at least as good as the true maximum; but every provable f has to be correct at the action actually taken (that is the one point that gets checked), which refutes that belief.
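Here’s a minimal toy sketch of the difference (the proof search is mocked: the agent is simply handed the set of maps f that are pretended to be provable, so this only illustrates why the pointwise maximum blocks the spurious argmax, not the proof search itself):

```python
# Toy 5-and-10: actions are the dollar amounts, and the true payoff map is
# {5: 5, 10: 10}.  The proof search is mocked: `provable_maps` is just given.

def agent_A(provable_maps):
    """Original A: take *some* provable f and return argmax f."""
    f = provable_maps[0]                       # whichever proof is found first
    return max(f, key=f.get)

def agent_A_prime(provable_maps):
    """A': take *all* provable f and return the argmax of their pointwise maximum."""
    actions = provable_maps[0].keys()
    g = {a: max(f[a] for f in provable_maps) for a in actions}
    return max(g, key=g.get)

true_f     = {5: 5, 10: 10}
spurious_f = {5: 5, 10: 0}                     # the Löbian self-fulfilling map

print(agent_A([spurious_f, true_f]))           # -> 5, fooled by the spurious map
print(agent_A_prime([spurious_f, true_f]))     # -> 10, pointwise maximum is safe
```

Which maps really are provable of course depends on the agent itself; the point is just that as long as the true f is among the found ones, adding spurious maps can’t drag the argmax of the pointwise maximum below the true optimum.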
Note that A’ can’t prove that some f is not found, due to Gödel. Therefore, it can never prove what it is going to do!
With “find some f”, it’s clear when the algorithm should stop. With “find all f” it’s not as clear. If you stop after finding the first f, the problem comes back. If you search for all proofs up to some max length, nobody knows if the problem comes back or not.
With “find all f”, it ceases to be a classical algorithm, since we have to search through all proofs. For finitely many possible f it could be implemented using a halting oracle, for example. Decision theory approaches needn’t be computable, right?
Then I think you’ve reinvented this. And there were many advances after this, so have heart :-)
I think you’re right, except I don’t think I need the chicken rule. What’s “have heart”? What should I read about the advances? Perhaps the problems with each one, so I can try to use them to invent the next?
Yeah, we figured out sometime later that the chicken rule isn’t needed. Also we switched to using provability logic (a kind of modal logic), because it leads to a more natural formulation than halting oracles. As a bonus, that logic is decidable, so the whole thing can be implemented on a computer.
https://agentfoundations.org/item?id=4
That idea is known as “modal UDT” and I’m happy to have come up with it, because it seems to be a local peak in the space of ideas. But unfortunately that’s as far as it goes. Our world doesn’t look like a modal formula; it’s more like a cellular automaton, so we need decision theories that can work with that. Hence all the research on logical inductors etc.
(Oh, “have heart” was not “stay strong even though you’re 7 years behind”, but “be merciful, I wrote that 7 years ago”.)
Can the failure of this approach to handle scenarios as complicated as cellular automata be formalized using a problem like 5-and-10?
What do you mean by provability being decidable? Couldn’t I ask whether it is provable that a Turing machine halts?
Edit: In Conway’s Game of Life, one can build replicators and Turing machines. If we strewed provability-oracle interfaces across the landscape, we should be able to implement this reasoner, and it could do things like attempting to maximize the number of gliders. Pitted against a honeycomb maximizer, we could investigate whether they would each render themselves unpredictable to the other through the oracle, or whether the fact that war in Life just sends everything into primordial soup would get them to do a values handshake, etc. That doesn’t sound like a setting where modal logic fails to apply, does it?
Sorry about the late reply; I’m on vacation and not using the internet much. By “have heart” I meant “stay strong”. By decidability I meant this:
https://plato.stanford.edu/entries/logic-provability/#AritComp
The post I linked is based on GL.
In your scenario with two reasoners, I think they will be unpredictable to each other, and both will fail to find enough counterfactuals to choose an action. Modal UDT is single player only.
More generally I think single player decision theory will never make game theory fall out for free—that dream should be abandoned. But that’s a different conversation.
https://arxiv.org/abs/1401.5577 makes me think single player decision theory should be enough.
Consider B := “Have A’ give you a source code X, then execute X.” in modal combat (where the f may be partial). Pit B against itself. (D,C) and (C,D) are impossible by symmetry, so, assuming our proof system is sound, the pointwise maximum of all f is at most the utility of (C,C). That can be reached by returning “Cooperate”.
Pitting B against FairBot or PrudentBot, they shouldn’t be able to prove that B won’t prove it can gain more by defecting, unless they assume B’s consistency, in which case they should cooperate.
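Not B itself (that needs the unbounded proof search), but here’s a minimal sketch of how combat between simple modal agents like FairBot can actually be computed, using the Kripke-chain trick behind GL: treat “provable” at level n as “true at every level below n” and read off the stabilized actions. The agent and function names are mine; I’ve only included FairBot, DefectBot and CooperateBot from the linked paper.

```python
# Minimal sketch: "modal combat" between simple modal agents, evaluated on a
# finite Kripke chain of levels 0, 1, 2, ...  At level n, "provable(phi)" is
# read as "phi held at every level below n".  For agents of bounded modal
# depth the per-level actions stabilize, and (assuming the proof system is
# sound) the stabilized pair is the actual outcome.  This is only the easy
# bounded-depth special case, not a full GL decision procedure.

C, D = "C", "D"

def fairbot(my_history, opp_history):
    # Cooperate iff "the opponent cooperates with me" held at every lower
    # level, i.e. the box of that statement holds at this level.
    return C if all(a == C for a in opp_history) else D

def defectbot(my_history, opp_history):
    return D

def cooperatebot(my_history, opp_history):
    return C

def combat(agent1, agent2, levels=10):
    """Run both agents level by level; return the pair of stabilized actions."""
    hist1, hist2 = [], []              # actions at the lower levels
    for _ in range(levels):            # levels > modal depth of the agents used
        a1 = agent1(hist1, hist2)
        a2 = agent2(hist2, hist1)
        hist1.append(a1)
        hist2.append(a2)
    return hist1[-1], hist2[-1]

print(combat(fairbot, fairbot))        # ('C', 'C')  -- Löbian mutual cooperation
print(combat(fairbot, defectbot))      # ('D', 'D')  -- FairBot isn't exploited
print(combat(fairbot, cooperatebot))   # ('C', 'C')
```

PrudentBot could be added in the same style, but its condition also looks at how the opponent plays against DefectBot, so the sketch would need nested combats.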
I can see B failing to establish cooperation with itself when symmetry arguments are barred somehow. Perhaps it could work if B had a probability distribution over which PA+n it expects to be consistent...
For the glider vs honeycomb maximizer, I think the problem is agreeing on what division of the universe counts as (C,C).
Game theory also gives no answer to that problem. That said, I see hope that each could prove something like “We are symmetric enough that if I precommit to take no more than 60% by my measure, he will have precommitted to take no more than 80% by my measure. Therefore, by precommitting to take no more than 60%, I can know to get at least 20%.”
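Spelling out the arithmetic in that hoped-for conclusion (with the 60% and 80% figures from the example):

\[
\text{my guaranteed share} \;\ge\; 1 - 0.8 \;=\; 0.2,
\]

and taking that 20% stays within my own 60% cap, so the precommitment costs nothing in the guaranteed case.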