If something is capable of fulfilling human preferences in its actions, and you can convince it to do so, you’re already most of the way to getting it to do things humans will judge as positive. Then you only need to specify which preferences are to be considered good in an equally compelling manner. This is obviously a matter of much debate, but it’s an arena we know a lot about operating in. We teach children these things all the time.
Stuart does say something along the same lines that you point out in a later chapter; however, I felt it detracted from his idea of three principles (see the rough sketch after the list for how they might look in code):
1. The machine’s only objective is to maximize the realization of human preferences.
2. The machine is initially uncertain about what those preferences are.
3. The ultimate source of information about human preferences is human behavior.
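Just to make those principles concrete for myself, here is a toy sketch, entirely my own construction rather than anything from the book: the machine holds a prior over a couple of candidate preference models (principle 2), updates it from observed human choices under a simple noisy-rational model (principle 3), and then acts to maximize expected preference satisfaction under its current belief (principle 1). The candidate models, the softmax likelihood, and the rationality parameter are all illustrative assumptions, not Russell's formalism.

```python
from math import exp

# Hypothetical candidate preference models: each maps an outcome to a utility.
CANDIDATE_PREFERENCES = {
    "likes_tea":    {"tea": 1.0, "coffee": 0.2, "nothing": 0.0},
    "likes_coffee": {"tea": 0.2, "coffee": 1.0, "nothing": 0.0},
}

# Principle 2: the machine starts out uncertain -> uniform prior over candidates.
belief = {name: 1.0 / len(CANDIDATE_PREFERENCES) for name in CANDIDATE_PREFERENCES}

def observe_choice(belief, chosen, alternatives, rationality=3.0):
    """Principle 3: update the belief from human behavior, here a noisy-rational
    choice among alternatives modeled with a softmax likelihood."""
    new_belief = {}
    for name, prefs in CANDIDATE_PREFERENCES.items():
        weights = {o: exp(rationality * prefs[o]) for o in alternatives}
        likelihood = weights[chosen] / sum(weights.values())
        new_belief[name] = belief[name] * likelihood
    total = sum(new_belief.values())
    return {name: p / total for name, p in new_belief.items()}

def best_action(belief, options):
    """Principle 1: choose the option with the highest expected utility under
    the machine's current belief about the human's preferences."""
    def expected_utility(option):
        return sum(p * CANDIDATE_PREFERENCES[name][option]
                   for name, p in belief.items())
    return max(options, key=expected_utility)

# The human is repeatedly observed choosing tea over coffee...
for _ in range(3):
    belief = observe_choice(belief, chosen="tea", alternatives=["tea", "coffee"])

print(belief)                                  # probability mass shifts toward "likes_tea"
print(best_action(belief, ["tea", "coffee"]))  # -> "tea"
```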
He goes on at such length to qualify and add special cases that the word “ultimate” in principle #3 seems to have been a poor choice because it becomes so watered down as to lose its authority.
If things like laws, ethics and morality are used to constrain what AI learns from preferences (which seems both sensible and necessary, as in the parent/child example you provide), then I don’t see how preferences are “the ultimate source of information”; they seem more like one of many training streams. I also don’t see that his point #3 by itself deals with the issue of evil.
As you point out, this whole area is “a matter of much debate,” and I’m pretty confident that, like most philosophical discussions, it will (and should) go on forever. However, I am not entirely confident that Stuart’s model won’t end up suffering the same fate as Marvin Minsky’s “Society of Mind.”
That does sound problematic for his views, if he actually holds these positions. I am not really familiar with him, even though he did write the textbook for my class on AI (third edition) back when I was in college. At that point, there wasn’t much on the techniques that are current now, and I don’t remember him discussing this sort of thing (though we might simply have skipped such a section).
You could think of it as us having preferences about our preferences too. It’s a bit self-referential, but that’s actually a key part of being a person. You could also infer what we consider to be ‘right’ directly from how we act when knowingly pursuing those objectives, though this requires much more insight.
You’re right, the debate will keep going on in philosophical style, but whether or not it works as an approach for something other than humans could change that.