I had an idea, and was wondering what its fatal flaw was. For UDT, what happens if, instead of proving theorems of the form “actionx --> utilityx” , it proves theorems of the form “PA+actionx |- utilityx”?
At a first glance, this seems to remove the problem of spurious counterfactuals implying any utility value, but there’s probably something big I’m missing.
What happens if it only considers the action if it both failed to find “PA+A()=x” inconsistent and found a proof that PA+A()=x proves U()=x?
Do an inconsistency check first and only consider/compare the action if the inconsistency check fails.
(If PA is strong enough to prove what the agent will do, then PA + “A()=x” is only consistent for one particular action x, and the specific action x for which it is consistent will be up for grabs, depending upon which proof the inconsistency checker finds.)
I had an idea, and was wondering what its fatal flaw was. For UDT, what happens if, instead of proving theorems of the form “actionx --> utilityx” , it proves theorems of the form “PA+actionx |- utilityx”?
At a first glance, this seems to remove the problem of spurious counterfactuals implying any utility value, but there’s probably something big I’m missing.
PA + “A() = x” may be inconsistent (proving, e.g., that 0=1 etc.).
What happens if it only considers the action if it both failed to find “PA+A()=x” inconsistent and found a proof that PA+A()=x proves U()=x? Do an inconsistency check first and only consider/compare the action if the inconsistency check fails.
Then you get spurious proofs of inconsistency :-)
(If PA is strong enough to prove what the agent will do, then PA + “A()=x” is only consistent for one particular action x, and the specific action x for which it is consistent will be up for grabs, depending upon which proof the inconsistency checker finds.)
Got it. Thank you!