This is why I suggested naming FDT “functional decision theory” rather than “algorithmic decision theory”, when MIRI was discussing names.
Suppose that Alice is an LDT Agent and Bob is an Alien Agent. The two swap source code. If Alice can verify that Bob (on the input “Alice’s source code”) behaves the same as Alice in the PD, then Alice will cooperate. This is because Alice sees that the two possibilities are (C,C) and (D,D), and the former has higher utility.
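As a toy illustration of this swap setup (a minimal sketch, not from any paper; the agent names, the `exec`-based "run your opponent's source" trick, and the payoff numbers are all illustrative assumptions, and a real agent would need a step budget or proof search to avoid infinite regress):

```python
import inspect

# Payoffs for (my_move, their_move) in a standard Prisoner's Dilemma.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def alice(opponent_source: str) -> str:
    """LDT-flavoured agent: cooperate iff the opponent, run on Alice's own
    source, cooperates. (Illustrative only: a real agent needs a step budget
    or proof search to avoid infinite regress; here we just run the opponent,
    assuming this file is executed as a script so inspect can find the source.)"""
    my_source = inspect.getsource(alice)
    namespace: dict = {}
    exec(opponent_source, namespace)                      # load the opponent's definition
    opponent = next(v for v in namespace.values() if callable(v))
    return "C" if opponent(my_source) == "C" else "D"

def bob(opponent_source: str) -> str:
    """An 'alien' agent whose rule happens to be: cooperate iff the opponent's
    source visibly conditions on my behaviour (a crude stand-in for Alice's
    verification step)."""
    return "C" if "opponent(my_source)" in opponent_source else "D"

def play(agent_a, agent_b):
    """Swap source code and let each agent decide on the other's source."""
    move_a = agent_a(inspect.getsource(agent_b))
    move_b = agent_b(inspect.getsource(agent_a))
    return move_a, move_b, PAYOFF[(move_a, move_b)]

print(play(alice, bob))   # ('C', 'C', (3, 3)): mutual cooperation
```

Against an unconditional defector, Alice's check fails and she defects instead, which is the point: the only live outcomes are (C,C) and (D,D).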
The same holds if Alice is confident in Bob’s relevant conditional behavior for some other reason, but can’t literally view Bob’s source code. Alice evaluates counterfactuals based on “how would Bob behave if I do X? what about if I do Y?”, since those are the differences that can affect utility; knowing the details of Bob’s algorithm doesn’t matter if those details are screened off by Bob’s functional behavior.
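A corresponding sketch of that counterfactual evaluation, assuming (purely for illustration) that Alice's confidence in Bob's relevant conditional behavior is summarized as a simple response map from her action to his; the map and payoff numbers are made up:

```python
# Alice's belief about Bob's conditional behaviour: "Bob mirrors whatever I do."
# The details of Bob's algorithm are screened off by this functional summary.
bob_response = {"C": "C", "D": "D"}

# Alice's payoff for (my_move, Bob's_move) in a standard Prisoner's Dilemma.
PAYOFF_FOR_ALICE = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def alice_choice(response_model):
    # Evaluate "what happens if I do X?" for each X and pick the best outcome.
    return max(["C", "D"],
               key=lambda x: PAYOFF_FOR_ALICE[(x, response_model[x])])

print(alice_choice(bob_response))   # 'C': (C, C) beats (D, D)
```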
Hm. What kind of dependence is involved here? Doesn’t seem like a case of subjunctive dependence as defined in the FDT papers; the two algorithms are not related in any way beyond that they happen to be correlated.
Alice evaluates counterfactuals based on “how would Bob behave if I do X? what about if I do Y?”, since those are the differences that can affect utility...
Sure, but so do all agents that subscribe to standard decision theories. The whole DT debate is about what that means.