Good question! Translating your question to the setting of the logical model, you’re suggesting that instead of using provability in Peano Arithmetic as the criterion for justified action, or provability in PA + Con(PA) (which would have the same difficulty), the agent uses provability under the assumption that its current formal system (which includes PA) is consistent.
Unfortunately, this turns out to be an inconsistent formal system!
Thus, you definitely do not want an agent that makes decisions on the criterion “if I assume that my own deductions are reliable, then can I show that this is the best action?”, at least not until you’ve come up with a heuristic version of this that doesn’t lead to awful self-fulfilling prophecies.
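The inconsistency is the standard argument from Gödel’s second incompleteness theorem; a sketch (the name T* is my notation, not the thread’s):

```latex
% Sketch: let T* be the theory PA + Con(T*), i.e. a theory that includes
% an assertion of its own consistency (such a T* exists by the diagonal
% lemma). Then:
\begin{align*}
  &T^* \vdash \mathrm{Con}(T^*)
    && \text{($\mathrm{Con}(T^*)$ is an axiom of $T^*$)}\\
  &T^* \text{ consistent} \;\Rightarrow\; T^* \nvdash \mathrm{Con}(T^*)
    && \text{(G\"odel II, since $T^*$ extends PA)}\\
  &\text{hence } T^* \text{ is inconsistent.}
\end{align*}
```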
I don’t think he was talking about self-PA, but rather an altered decision criterion, such that rather than “if I can prove this is good, do it” it is “if I can prove that if I am consistent then this is good, do it”, which I think doesn’t have this particular problem, though it does have others, and it still can’t /increase/ in proof strength.
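The difference between the two criteria can be sketched as predicates over a theorem set. Everything here is illustrative — the `Theory` class, the tuple encoding of formulas, and the `'Con(self)'` token are my stand-ins, not part of any actual agent formalism:

```python
# Toy sketch of the two decision criteria. The formula encoding (nested
# tuples) and the stubbed-out prover are illustrative assumptions.

class Theory:
    """A 'prover' stubbed as a fixed set of provable formulas."""
    def __init__(self, theorems):
        self.theorems = set(theorems)

    def proves(self, formula):
        return formula in self.theorems

def naive_criterion(theory, action):
    # "if I can prove this is good, do it"
    return theory.proves(('good', action))

def conditional_criterion(theory, action):
    # "if I can prove that (if I am consistent then this is good), do it"
    return theory.proves(('imp', 'Con(self)', ('good', action)))

# A theory that can only establish goodness *conditional* on its own
# consistency, without ever asserting that consistency as an axiom:
t = Theory([('imp', 'Con(self)', ('good', 'a'))])
```

Here `conditional_criterion(t, 'a')` holds while `naive_criterion(t, 'a')` does not, which is how the altered rule can license actions without the theory ever asserting Con(self) outright.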
I don’t think he was talking about self-PA, but rather an altered decision criterion, such that rather than “if I can prove this is good, do it” it is “if I can prove that if I am consistent then this is good, do it”
Yes.
and it still can’t /increase/ in proof strength.
Mmm, I think I can see it. What about “if I can prove that if a version of me with unbounded computational resources is consistent then this is good, do it”? (*)
It seems to me that this allows increase in proof strength up to the proof strength of that particular ideal reference agent.
(* there should probably be additional constraints that specify that the current agent, and the successor if present, must provably be approximations of the unbounded agent in some conservative way)
“if I can prove that if a version of me with unbounded computational resources is consistent then this is good, do it”
In this formalism we generally assume infinite resources anyway. And even if this is not the case, consistent/inconsistent doesn’t depend on resources, only on the axioms and rules for deduction. So this still doesn’t let you increase in proof strength, although again it should help avoid losing it.
If we are already assuming infinite resources, then do we really need anything stronger than PA?
And even if this is not the case, consistent/inconsistent doesn’t depend on resources, only on the axioms and rules for deduction.
A formal system may be inconsistent, but a resource-bounded theorem prover working on it might never be able to prove any contradiction for a given resource bound. If you increase the resource bound, contradictions may become provable.
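This point can be illustrated with a toy bounded prover (a made-up mini-logic, purely for illustration): the axiom set below is inconsistent, but a forward-chaining prover limited to two rounds of modus ponens never derives the contradiction, while three rounds suffice.

```python
# Toy illustration: an inconsistent axiom set whose contradiction is only
# reachable with enough inference steps. Formulas are strings or tuples:
# ('imp', p, q) for implication, ('not', p) for negation (my encoding).

def saturate(axioms, max_rounds):
    """Apply modus ponens for up to max_rounds rounds; return all derived formulas."""
    known = set(axioms)
    for _ in range(max_rounds):
        new = set()
        for f in known:
            if isinstance(f, tuple) and f[0] == 'imp' and f[1] in known:
                if f[2] not in known:
                    new.add(f[2])
        if not new:
            break
        known |= new
    return known

def finds_contradiction(axioms, max_rounds):
    """Does the bounded prover derive both X and ('not', X) for some X?"""
    known = saturate(axioms, max_rounds)
    return any(('not', f) in known for f in known)

# Inconsistent axioms: A, A -> B, B -> C, C -> not A.
# Deriving the contradiction (A together with not A) takes three rounds.
axioms = ['A', ('imp', 'A', 'B'), ('imp', 'B', 'C'),
          ('imp', 'C', ('not', 'A'))]
```

With `max_rounds=2` the prover stops at C and reports no contradiction; with `max_rounds=3` it derives `('not', 'A')` and finds one — the system’s inconsistency was there all along, only the resource bound hid it.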