This is the crippleware version of TDT that pure CDT agents self-modify to. It’s crippleware because if you self-modify at 7:00pm you’ll still two-box against an Omega who saw your code at 6:59pm.
Penalising a rational agent for its character flaws while it is under construction seems like a rather weak objection.
Most systems have a construction phase during which they may behave imperfectly—so similar objections seem likely to apply to practically any system. However, this is surely no big deal: once a synthetic rational agent exists, we can copy its brain. After that, developmental mistakes would no longer be much of a factor.
It does seem as though this makes CDT essentially correct—in a sense. The main issue would then become one of terminology—of what the word “rational” means. There would be no significant difference over how agents should behave, though.
My reading of this issue is that the case goes against CDT’s terminology, which is misleading, rather than against CDT itself. I don’t think there’s much of a case that it is actually wrong.