Their motivation (or what they care about) should be in line with their rationality. This alignment does not hold for humans, because we have evolutionarily selected, primitive motivations coupled with weak rationality, but that failure should not occur in much more intelligent, deliberately designed (possibly self-modifying) agents. Logically, one should care about what one’s rationality tells one.
Since we can’t build superintelligences straight off, we have to build self-improving AIs.
A rational self-improving AI has to be motivated to become more intelligent, rational, and so on.
So rational self-improving AIs won’t have arbitrary motivations. They will be motivated to value rationality in order to become more rational.
Valuing rationality means disvaluing bias and partiality.
Therefore, a highly rational agent would not arbitrarily disregard valid rational arguments (we don’t expect highly rational humans to say “that is a perfectly good argument, but I am going to just ignore it”).
Therefore, a highly rational agent would not arbitrarily disregard valid rational arguments for morality.
Therefore, a highly rational agent would not “just not care”. The only possible failure modes are:
1) Non-existence of good rational arguments for morality (a failure of objective moral cognitivism).
2) Failure of intrinsic motivation to arise from their conceptual understanding of valid arguments for morality, i.e., they understand that X is good, that they should do X, and what “should” means, but none of that adds up to a motivation to do X.
Yet, as the eminent philosopher Joss Whedon observed, “Yeah… but [they] don’t care!”