But what happens when the low-intensity conversation and the brainwashing are the same thing?
That’s definitely bad in cases where people explicitly care about goal preservation. But only self-proclaimed consequentialists do.
The other cases are fuzzier. Memeplexes like rationality, EA/utilitarianism, religious fundamentalism, political activism, or Ayn Rand-type stuff are constantly ‘radicalizing’ people, turning them from something sort-of-agenty-but-not-really into self-proclaimed consequentialist agents. Whether that is in line with people’s ‘real’ desires is to a large extent up for interpretation, though there are extreme cases where the answer seems clearly ‘no.’ Insofar as recruiting strategies are concerned, we can at least condemn propaganda and brainwashing because they are negative-sum (but the lines might again be blurry).
It is interesting that people don’t turn into self-proclaimed consequentialists on their own without the influence of ‘aggressive’ memes. This just goes to show that humans aren’t agents by nature, and that an endeavor of “extrapolating your true consequentialist preferences” is at least partially about adding stuff that wasn’t previously there rather than discovering something that was hidden. That might be fine, but we should be careful not to assume unquestioningly that this automatically qualifies as “doing people a favor.” This, too, is up for interpretation to at least some extent. The argument for it being a favor is presented nicely here. The counterargument is that satisficers often seem pretty happy, and who are we to maneuver them into a situation where they cannot escape their own goals and must always live for the future instead of the now? (Technically, people can just choose whichever consequentialist goal is best fulfilled by satisficing, but I could imagine that many preference extrapolation processes are set up in a way that makes this an unlikely outcome. For me at least, learning more about philosophy automatically closed some doors.)
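To make the satisficer/maximizer contrast in that parenthetical concrete, here is a toy sketch; it is my own illustration rather than anything from the original discussion, and the names and numbers are made up:

```python
from typing import Callable, Iterable, Optional

def maximize(options: Iterable[int], value: Callable[[int], float]) -> int:
    """Evaluate every option and hold out for the best one."""
    return max(options, key=value)

def satisfice(options: Iterable[int], value: Callable[[int], float],
              good_enough: float) -> Optional[int]:
    """Return the first option that clears the aspiration threshold."""
    for opt in options:
        if value(opt) >= good_enough:
            return opt            # stop searching here
    return None                   # nothing was good enough

options = [3, 7, 2, 9, 5]
value = lambda x: x
print(maximize(options, value))       # 9: always lives for the best outcome
print(satisfice(options, value, 6))   # 7: the first "good enough" option
```

The satisficer’s stopping rule can of course be rewritten as “maximize, subject to not searching past the threshold,” which is the sense in which satisficing can be dressed up as a consequentialist goal.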
“extrapolating your true consequentialist preferences” is at least partially about adding stuff that wasn’t previously there rather than discovering something that was hidden.
Yes yes yes, this is a point I make often. Finding true preferences is not just a learning process, and cannot be reduced to a learning process.
As for why it needs to be done… well, for designs like Inverse Reinforcement Learning that involve AIs learning human preferences, it has to be done adequately if those designs are to work at all.
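To make concrete why IRL-style designs presuppose coherent preferences, here is a minimal sketch assuming a Boltzmann-rational choice model and a linear reward; the setup and function names are my own illustration, not any particular IRL system. The inference only works under the built-in assumption that some fixed reward vector generated the choices in the first place:

```python
import numpy as np

def boltzmann_choice_probs(rewards, beta=1.0):
    """P(option i is chosen) under a Boltzmann-rational demonstrator."""
    z = beta * np.asarray(rewards, dtype=float)
    z -= z.max()                      # numerical stability
    p = np.exp(z)
    return p / p.sum()

def infer_reward_weights(option_features, observed_choices, beta=1.0,
                         lr=0.1, steps=2000):
    """Fit a linear reward r_i = w . features_i by gradient ascent on the
    log-likelihood of the observed choices. The procedure only makes sense
    if a coherent w is assumed to exist to be discovered."""
    n_features = option_features[0].shape[1]
    w = np.zeros(n_features)
    for _ in range(steps):
        grad = np.zeros(n_features)
        for feats, choice in zip(option_features, observed_choices):
            p = boltzmann_choice_probs(feats @ w, beta)
            # d/dw log P(choice) = beta * (f_choice - E_p[f])
            grad += beta * (feats[choice] - p @ feats)
        w += lr * grad / len(observed_choices)
    return w

# Toy usage: 200 choice situations, each with 3 options and 2 features.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])                    # the assumed "true preferences"
situations = [rng.normal(size=(3, 2)) for _ in range(200)]
choices = [rng.choice(3, p=boltzmann_choice_probs(f @ true_w)) for f in situations]
print(infer_reward_weights(situations, choices))  # roughly recovers true_w
```

If the choices were not generated by anything like a fixed `true_w`, the fitted weights are an artifact of the model rather than a discovered preference, which is the sense in which the process adds something rather than merely learning it.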
It is interesting that people don’t turn into self-proclaimed consequentialists on their own without the influence of ‘aggressive’ memes.
Why do you think so? It’s not self-evident to me. Consequentialism is… strongly endorsed by evolution. If dying is easy (as it was for most of human history), not being a consequentialist is dangerous.
I agree with this, except that I would go further and add that if we had a superintelligent AI correctly calculate our “extrapolated preferences,” those preferences would precisely include not being made into agents.