Also: I think making sure our agents are DDT is probably going to be approximately as difficult as making them aligned. Related: Your handle for anthropic uncertainty is:
never reason about anthropic uncertainty. DDT agents always think they know who they are.
“Always think they know who they are” doesn’t cut it; you can think you know you’re in a simulation. I think a more accurate version would be something like “Always think that you are on an original planet, i.e. one in which life appeared ‘naturally,’ rather than a planet in the midst of some larger interstellar civilization, or a simulation of a planet, or whatever.” Basically, you need to believe that you were created by humans but that no intelligence played a role in the creation and/or arrangement of the humans who created you. Or… no role other than the “normal” one in which parents create offspring, governments create institutions, etc. I think this is a fairly specific belief, and I don’t think we have the ability to shape our AIs’ beliefs with that much precision, at least not yet.