Couldn’t you just send one bit X (1 means on, 0 means off), which is most likely 1 but could be flipped to 0 by noise, and define the utility u* in the same way as for corrigibility? That is,
u*(A_1, 0, A_2) = u(A_1, 0, A_2)
u*(A_1, 1, A_2) = u(A_1, 1, A_2) + E_{A_2′}[u(A_1, 0, A_2′)] − E_{A_2′}[u(A_1, 1, A_2′)]
Here A_1 denotes what happens in the world before the signal is sent, and A_2 what happens afterwards. This way you only use 1 bit rather than 100, and there is no longer a contribution of 2^{-100} from the case where a thermodynamic miracle turns the on-signal into the off-signal (and you don’t have to worry about the distribution of the signal given a thermodynamic miracle). The oracle will optimize u given that X=0 until X is revealed. When it is revealed, we will most likely have X=1, and the oracle will then optimize u given X=1 (if it is still running). Does the original proposal achieve something more?
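To see the correction term at work, here is a minimal Python sketch (the toy utility u, the reference distribution over A_2′, and all concrete numbers are illustrative assumptions, not part of the proposal itself). It checks the indifference property the construction is meant to buy: before X is revealed, the expected value of u* is the same whether the bit comes out 0 or 1.

```python
# Toy illustration of the one-bit indifference construction above.
# The utility u, the reference distribution over A_2', and all concrete
# numbers are illustrative assumptions.

def u(a1, x, a2):
    """Toy base utility over (pre-signal world A_1, bit X, post-signal world A_2)."""
    return a1 + (10.0 if x == 1 else 1.0) * a2

# Reference distribution over post-signal outcomes A_2', standing in for
# the expectation terms E_{A_2'} in the definition of u*.
A2_SUPPORT = [0.0, 1.0, 2.0]
A2_PROBS = [0.2, 0.5, 0.3]

def expected_u(a1, x):
    """E_{A_2'} u(a1, x, A_2') under the reference distribution."""
    return sum(p * u(a1, x, a2) for a2, p in zip(A2_SUPPORT, A2_PROBS))

def u_star(a1, x, a2):
    """u* as defined above: identical to u when X=0; when X=1, shifted by
    a constant so the agent is indifferent to which value X takes."""
    if x == 0:
        return u(a1, 0, a2)
    return u(a1, 1, a2) + expected_u(a1, 0) - expected_u(a1, 1)

# Indifference check: before X is revealed, the expected u* under the
# reference distribution is the same whether X comes out 0 or 1.
a1 = 3.0
ev0 = sum(p * u_star(a1, 0, a2) for a2, p in zip(A2_SUPPORT, A2_PROBS))
ev1 = sum(p * u_star(a1, 1, a2) for a2, p in zip(A2_SUPPORT, A2_PROBS))
assert abs(ev0 - ev1) < 1e-9
```

Note that the offset added in the X=1 branch is constant with respect to A_2, so it shifts expectations without changing which post-signal action the oracle prefers once X is known.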
This seems to work. The difference is in how the revelation of X is handled; I’m not sure which approach is better, or under what circumstances.