Skeptityke comments on New paper from MIRI: “Toward idealized decision theory”

Skeptityke 21 Dec 2014 0:19 UTC
1 point
I had an idea, and was wondering what its fatal flaw was. For UDT, what happens if, instead of proving theorems of the form “actionx --> utilityx” , it proves theorems of the form “PA+actionx |- utilityx”?

At a first glance, this seems to remove the problem of spurious counterfactuals implying any utility value, but there’s probably something big I’m missing.
- So8res 21 Dec 2014 1:04 UTC
  1 point
  Parent
  PA + “A() = x” may be inconsistent (proving, e.g., that 0=1 etc.).
  - Skeptityke 21 Dec 2014 1:27 UTC
    1 point
    Parent
    What happens if it only considers the action if it both failed to find “PA+A()=x” inconsistent and found a proof that PA+A()=x proves U()=x? Do an inconsistency check first and only consider/compare the action if the inconsistency check fails.
    - So8res 21 Dec 2014 1:46 UTC
      3 points
      Parent
      Then you get spurious proofs of inconsistency :-)
      
      (If PA is strong enough to prove what the agent will do, then PA + “A()=x” is only consistent for one particular action x, and the specific action x for which it is consistent will be up for grabs, depending upon which proof the inconsistency checker finds.)
      - Skeptityke 21 Dec 2014 2:02 UTC
        1 point
        Parent
        Got it. Thank you!