One caution: “trustworthiness” and “altruism” may not be traits that are stable across situations. As I noted in this post, there’s good reason to think human behavior evolved to follow conditional rules, so trustworthiness and altruism observed under some conditions may be very poor evidence of Friendliness for superintelligence-coding purposes.