Have you read the two articles I linked to, explaining the general principle?
Yes, though I have not thought deeply (hat tip to Jonah :-D) about them.
The idea of decoupling AI beliefs from AI actions looks bad to me on its face. I expect it to introduce a variety of unpleasant failure modes (“of course I fully believe in CEV, it’s just that I’m going to act differently...”) and general fragility. And even if one of the utility functions is “do not care about anything but miracles”, I still think it’s just going to lead to a catatonic state.