I think Eliezer’s reply (point ‘(B)’) to this comment by Wei Dai provides some explanation as to what the decision theory is doing here.
From the reply (concerning UDT):
I still think [an AI ought to be able to come up with these ideas by itself], BTW. We should devote some time and resources to thinking about how we are solving these problems (and coming up with questions in the first place). Finding that algorithm is perhaps more important than finding a reflectively consistent decision algorithm, if we don’t want an AI to be stuck with whatever mistakes we might make.
And yet you found a reflectively consistent decision algorithm long before you found a decision-system-algorithm-finding algorithm. That’s not coincidence. The latter problem is much harder. I suspect that even an informal understanding of parts of it would mean that you could find timeless decision theory as easily as falling backward off a tree: you just run the algorithm in your own head. So with very high probability you are going to start seeing through the object-level problems before you see through the meta ones. Conversely I am EXTREMELY skeptical of people who claim they have an algorithm to solve meta problems but who still seem confused about object problems. Take metaethics, a solved problem: what are the odds that someone who still thought metaethics was a Deep Mystery could write an AI algorithm that could come up with a correct metaethics? I tried that, you know, and in retrospect it didn’t work.
The meta algorithms are important but by their very nature, knowing even a little about the meta-problem tends to make the object problem much less confusing, and you will progress on the object problem faster than on the meta problem. Again, that’s not saying the meta problem is unimportant. It’s just saying that it’s really hard to end up in a state where meta has really truly run ahead of object, though it’s easy to get illusions of having done so.