> I see a constellation of musings which seem somewhat promising but I can't really comprehend it as an idea. I cannot state in my own words what you mean.
>
> I thought that system-level reasoning vs. some old way of doing things was pretty important, and now it seems it's only a minor detail.
>
> It would seem that "traditionally" we have "moral systems", "law systems" or "strategy systems", and then we improve on this by using "money systems". But these words are used in an abstracted sense, or have additional meanings that those words do not usually have, so it becomes extremely hard to pinpoint what is meant.
I tried to formulate my idea "in a few words" in this part of the post: "Alignment. Recap".
You can split the possible effects of an AI's actions into three domains. The domains are distinct (each has its own associated ideas), even though they partially intersect and can be formulated in terms of each other. Traditionally we focus on the first two domains:
1. (Not) accomplishing a goal. Utility functions are about this.
2. (Not) violating human values. Models of human feedback are about this.
3. (Not) modifying a system without breaking it. Impact measures are about this.
My idea is about combining all of this (mostly 2 and 3) into a single approach, or about generalizing ideas for the third domain. There aren't many ideas for the third domain, as far as I know; maybe people are not aware enough of it.
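As a purely illustrative toy sketch of what "combining into a single approach" could mean, here is a snippet where each domain contributes its own term to one score. Every name and weight here is hypothetical, not something from the post:

```python
# Toy sketch (hypothetical names and weights): one score built from the
# three domains — goal accomplishment, human feedback, and impact.

def combined_objective(task_utility, human_feedback_score, impact_penalty,
                       w_task=1.0, w_feedback=1.0, w_impact=1.0):
    """Score an action by how well it achieves the goal (domain 1),
    how well it agrees with human feedback (domain 2), and how much
    it perturbs the surrounding system (domain 3)."""
    return (w_task * task_utility              # domain 1: utility functions
            + w_feedback * human_feedback_score  # domain 2: models of human feedback
            - w_impact * impact_penalty)       # domain 3: impact measures


# An action that achieves the goal and pleases humans can still score
# poorly if it heavily modifies the system and w_impact is large enough.
print(combined_objective(task_utility=1.0,
                         human_feedback_score=0.8,
                         impact_penalty=2.5))  # -> -0.7
```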
I know that it's confusing; I struggled to formulate the difference myself. But once you see the difference between the three domains, everything should become clear. "Human values vs. the laws of a society" may be a good analogy for the difference between 2 and 3: those two things are not equivalent, even though they intersect and can be formulated in terms of each other.
> I thought that system-level reasoning vs. some old way of doing things was pretty important, and now it seems it's only a minor detail.
I believe there's a difference, but the difference isn't about complexity. The complexity of reasoning doesn't depend on your goals or "code of conduct".