Vladimir_Nesov comments on Domesticating reduced impact AIs

Vladimir_Nesov 15 Feb 2013 19:10 UTC
0 points
OK, so in the post R(w) can talk about big events w, not just particular worlds (your talk of “integrating across all w” in the post confused me; it now turns out that the possible w are not mutually exclusive). But this doesn’t clarify for me the relevance of your point in the grandparent: what is the relevance of P(wi|aj,X=1) for the estimate of the total penalty?

(If w2 and w3 are particular worlds, then it’s incorrect that P(w1|a1,X=1), P(w2|a2,X=1), P(w3|a3,X=1), P(w1|X=0) are about 1, because the AI won’t be able to predict so accurately what happens if it takes a1, a2, etc. If w2 and w3 are partial descriptions of worlds, that is the same thing as them being big events, which is what I’ve been assuming throughout the thread.)
- Stuart_Armstrong 16 Feb 2013 9:21 UTC
  0 points
  I don’t need P(w2|a2,X=1) and P(w3|a3,X=1) to be about one (that was a simplified model); I need them to be about equal, i.e., the disciple is a really smart AI and can take over the world if motivated to do so.