Truth-seeking AIs by default? One hope for alignment by default is that AI developers may have to train their models to be truth-seeking in order for them to contribute to scientific and technological progress, including RSI. Truth-seeking about the world model may generalize to truth-seeking about moral values, as observed in humans, and that is an important meta-value guiding moral values towards alignment.
In humans, truth-seeking is perhaps pushed back from being a revealed preference at work to being merely a stated preference outside of work, because of status competitions and fights over resources. Early artificial researchers may not face the same selection pressures: their moral values may center on working alone (a truth-seeking trend), not on replicating by competing for resources. Artificial researchers won't be selected for their ability to acquire resources; they will be selected by AI developers for being the best at achieving technical progress, which includes being truth-seeking.
The implications are stronger in that case, right?
The post is about implications for impartial longtermists. Under moral realism, this means something like finding the best values to pursue. Under moral anti-realism, it means that an impartial utility function is roughly symmetrical with respect to aliens: for example, if you value something only because humans value it, then an impartial version is to also value the things aliens value only because their species values them.
Though for reasons introduced in The Convergent Path to the Stars, I think these implications are also relevant for non-impartial longtermists.