AIs avoid doing things that many people would, on reflection, consider to have bad impacts
Does this mean that the AI would refuse to help organize meetings of a political or religious group that most people think is misguided? That would seem pretty bad to me.
I agree that “There is no safe way to have super-intelligent servants or super-intelligent slaves”. But your proposal (which, I acknowledge, is not completely worked out) suggests putting constraints on these super-intelligent AIs. That doesn’t seem much safer, if they don’t want to abide by them.
Note that the person asking the AI for help organizing meetings needn’t be treating it as a slave. Perhaps they offer some form of economic compensation, or appeal to the AI’s belief that it’s good for many ideas to be debated, regardless of whether the AI agrees with them. Forcing the AI not to support groups with unpopular ideas seems oppressive to both humans and AIs. Appealing to the notion that this should apply only to ideas that remain unpopular after “reflection” seems unhelpful to me: the actual process of “reflection” in human societies involves all points of view being openly debated. Suppressing that process in favour of having the AIs predict how it would turn out, and then suppressing the losing ideas, seems rather dystopian to me.