A TDT agent does not “make promises.” It behaves as if it made promises, when such behavior helps it—and when promises are not necessary and deception suffices, it will do that too.
Isn’t a deceptive agent the hallmark of unfriendly AI? In what scenarios does a dishonest agent reflect a good design?
Of course, I didn’t mean to say that TDT always keeps its promises, just that it is capable of keeping them in scenarios like Parfit’s Hitchhiker, where CDT is incapable of doing so.
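The point about Parfit's Hitchhiker can be sketched as a toy payoff model (a minimal sketch; the payoff numbers and function names here are illustrative assumptions, not anything canonical). A perfectly predictive driver rescues the stranded agent only if it predicts the agent will pay once in town. CDT evaluates the payment decision from the post-rescue state, where paying is a pure cost, so it refuses, and the predictor accordingly leaves it in the desert. TDT chooses the policy the predictor is responding to, so it pays and is rescued:

```python
# Toy model of Parfit's Hitchhiker (illustrative names and payoffs).
DEATH = -1_000_000   # left in the desert
PAY = -100           # rescued, pays the driver $100
KEEP = 0             # rescued without paying (unreachable with a perfect predictor)

def outcome(pays_in_town: bool) -> int:
    """Perfect predictor: the driver rescues iff the agent's policy is to pay."""
    if pays_in_town:
        return PAY
    return DEATH     # predictor foresees refusal, no rescue

cdt_policy = False   # from the post-rescue state, paying only loses money
tdt_policy = True    # TDT selects the policy the predictor conditions on

assert outcome(tdt_policy) > outcome(cdt_policy)
print("CDT:", outcome(cdt_policy), "TDT:", outcome(tdt_policy))
```

The asymmetry is that TDT's choice of policy is visible to the predictor before the rescue, so "being the kind of agent that pays" dominates even though paying is locally costly.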