A hole big enough that it seems too obvious to point out: the AI says "the climate is going to change", the human says "well, duh", and the AI has now successfully convinced a human that climate change is going to happen, +1.
I would assume the AI would be asking something like "do you want me to bring this about?". Whether someone tries to stop a change may depend on how they perceive it to be happening. For example, if the AI convinces a human that humans themselves are causing climate change, they might object to climate change but have psychological difficulty resisting themselves.
There is also the issue that if you are convinced something is happening, resistance is futile. For sensible resistance to manifest, it needs to (at least seem to) not be too late to affect the thing, which means the looming effect can't be near-inevitable. If you are convinced that atom bombs will hit the ground in 5 minutes, you think of cool last words, not of how to object to that (but the function would count this non-objection as a plus).
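To make that concrete, here is a minimal sketch of the failure mode, assuming the function scores a simulated future by counting predicted objections; all the names and the inevitability threshold are made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class Human:
    acceptable_outcomes: set  # outcomes this person would not object to

@dataclass
class Future:
    outcome: str
    perceived_inevitability: float  # 0.0 = clearly stoppable, 1.0 = certain

def predicted_objections(humans, future):
    """Count humans who would actively object to this future."""
    objections = 0
    for h in humans:
        dislikes = future.outcome not in h.acceptable_outcomes
        # Nobody bothers to resist what they believe is already inevitable:
        # "cool last words" instead of objection.
        seems_stoppable = future.perceived_inevitability < 0.95
        if dislikes and seems_stoppable:
            objections += 1
    return objections

def score(humans, future):
    # Fewer objections -> higher score: the function counts silence as a plus.
    return -predicted_objections(humans, future)

humans = [Human(acceptable_outcomes={"peace"}) for _ in range(10)]
stoppable_doom = Future(outcome="bombs", perceived_inevitability=0.5)
inevitable_doom = Future(outcome="bombs", perceived_inevitability=0.99)
assert score(humans, inevitable_doom) > score(humans, stoppable_doom)
```

The same bad outcome, presented as inevitable rather than preventable, outscores itself, so the function rewards making doom look unstoppable.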
Say there is one person that a lot of other people hate. If you gathered everybody to vote on whether to exile or murder that person, they might vote one way. Now instead have everyone separately approve the simulated future in which he is dead. Aggregating these "uncaused" approvals might yield a death verdict where a self-conscious decision process would not give such a verdict.
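A minimal sketch of that gap, assuming each person privately prefers the hated person gone but refuses to consciously vote for killing him (the dictionary keys are hypothetical):

```python
def approves_simulated_future(person, target_is_dead):
    # Privately, the simulated world without the hated person just looks better.
    return target_is_dead and person["hates_target"]

def votes_for_death(person):
    # An explicit, self-conscious decision invokes a moral constraint
    # that passive approval of a simulated future never triggers.
    return person["hates_target"] and not person["refuses_to_kill"]

people = [{"hates_target": True, "refuses_to_kill": True} for _ in range(100)]

approval = sum(approves_simulated_future(p, target_is_dead=True) for p in people)
vote = sum(votes_for_death(p) for p in people)

print(approval)  # 100: the aggregated "uncaused" future gets full approval
print(vote)      # 0: the deliberate decision process gives no death verdict
```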