Or consider the conflict “I really enjoy dunking on the outgroup (but have some niggling sense of unease about this)” — we can’t conclude from the fact that the enjoyment of dunking is loud, whereas the niggling doubt is quiet, that the dunking-on-the-outgroup value will be the one left standing after reflection.
Assuming shard theory is basically correct, this aspect of Nate’s story can be resolved by viewing self-reflection as a context like any other. If you put the system in a training setup which causes it to self-reflect, and reward it when it comes to the ‘more diamonds’ conclusion, then this should cause it to reflectively want more diamonds.
The only question is how much training it to maximize diamonds in maze-finding causes the ‘max diamonds’ shard to be activated while in the self-reflecting context.
Also, notably, it will definitely be doing a modicum of self-reflection during the normal course of training, as the shards which do self-reflection will steer the future towards locations which reinforce their weight.
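To make the "how much does it carry over" question concrete, here is a toy sketch, not a claim about how shards are actually implemented. It assumes a shard's strength can be written as one number per context, that reward in one context raises the strength there, and that a made-up `spillover` fraction of that update leaks into other contexts. The names `train`, `run`, and `spillover`, and the linear update rule, are all hypothetical illustrations.

```python
def train(weights, context, reward, lr=0.1, spillover=0.0):
    """Return updated per-context strengths for a hypothetical 'max diamonds' shard.

    The rewarded context gets the full update; every other context gets
    only a `spillover` fraction of it (an assumed stand-in for generalization).
    """
    updated = dict(weights)
    for ctx in updated:
        scale = 1.0 if ctx == context else spillover
        updated[ctx] += lr * reward * scale
    return updated


def run(spillover, steps=10):
    """Reward the shard only in the maze context for `steps` updates."""
    weights = {"maze": 0.0, "self_reflection": 0.0}
    for _ in range(steps):
        weights = train(weights, "maze", reward=1.0, spillover=spillover)
    return weights


if __name__ == "__main__":
    # No generalization: the reflective context never hears about diamonds.
    print(run(spillover=0.0))
    # Partial generalization: some of the maze training carries over.
    print(run(spillover=0.2))
```

In this toy, everything hinges on `spillover`: at zero, maze training leaves the self-reflecting context untouched, and you would need to reward the ‘more diamonds’ conclusion in that context directly. What the real value of that carry-over looks like is exactly the open question.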