Suppose we had an AI with a utilitarian utility function of maximizing subjective human well-being (meaning that well-being is not something as simple as the physical sensation of "pleasure", and depends on the mental facts of each person), and let us also assume the AI can model this well (say, at least as well as the best of us can deduce another person's values concerning their own well-being).
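A rough formalization of what I take this to mean (my notation, not the original poster's): the AI chooses actions a to maximize U(a) = \sum_{i \in H} W_i(\hat{s}_i(a)), where H is the set of humans, \hat{s}_i(a) is the AI's model of person i's mental state given action a, and W_i maps that modeled state to person i's subjective well-being. The assumption is that each \hat{s}_i is at least as accurate as the best human judge of another person's values.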
You’ve crammed all the difficulty of FAI into this sentence. An additional limit on how much it can manipulate us does little if anything to make this part easier, and it introduces the further complication of how strict that limit should be. The question of how much an FAI would manipulate us is an interesting one, but either it’s a small part of the problem or it’s something that will be subsumed in the main question of “what do we want?”. By the latter I mean that we may decide the best way to determine how much the FAI should change our values is to have it calculate our CEV, the same way the FAI will decide what economic system to implement.
This is not meant to be a resolution to FAI, since you can’t stop technology. It’s meant to highlight whether the bad behavior of AI ends up being due to future technology that can change humanity more directly. I’m asking the question because the answer may give insights into how to tackle the problem.