I would of course choose option #1, adding that, due to an affliction giving me a trembling hand, I tend to get stranded in the desert and the like a lot and hence that I would appreciate it if he would spread the story of my honesty among other drivers. I might also promise to keep secret the fact of his own credulity in this case, should he ask me to. :)
I understand quite well that the best and simplest way to appear honest is to actually be honest. And also that, as a practical matter, you never really know who might observe your selfish actions and how that might hurt you in the future. But these prudential considerations can already be incorporated into received decision theory (which, incidentally, I don’t think matches up with either CDT or EDT, at least as those acronyms seem to be understood here). We don’t seem to need TDT and UDT to somehow glue them into the foundations.
Hmmm. Is EY perhaps worried that an AI might need even stronger inducements toward honesty? Maybe it would, but I don’t see how you solve the problem by endowing the AI with a flawed decision theory.