Here’s my question: what if instead of fixed concepts and rules, AI alignment focused on actions as the underlying reward function? In other words, might programming AI to focus on the means rather than the ends facilitate an environment in which humans are freer to act and reach their own ends, prioritizing activated potential over predetermined outcome? Can the action, instead of the outcome, become the parameter, rendering AI a facilitator rather than a determiner?
If I’m understanding you right, which I’m not sure I am, I think this just collapses back to the normal case, except that the things being optimized for are those you demarcate as “means” rather than “ends”. That is, the means literally become the ends, because they are what is being optimized for.
I think you are understanding correctly, and I see your point. So the question becomes: can we intervene before it becomes cyclical, so that the focus stays on process rather than outcome? That’s where the means and the ends remain separate. In effect, can a non-deterministic AI model be written?
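The collapse described above can be sketched concretely. Below is a minimal, hypothetical example (all names and scores are invented for illustration): a greedy agent always picks whichever action maximizes its reward signal, so if the reward is attached to process features ("means") instead of outcomes, those process features simply become the new optimization target.

```python
def greedy_policy(actions, reward_fn):
    """Pick the action that maximizes reward_fn."""
    return max(actions, key=reward_fn)

# Hypothetical actions, each scored on its outcome and on a
# "process quality" feature (e.g., how deliberative it was).
actions = [
    {"name": "shortcut", "outcome": 10, "process_quality": 1},
    {"name": "deliberate", "outcome": 4, "process_quality": 8},
]

# Outcome-based reward: the agent optimizes the end.
ends_choice = greedy_policy(actions, lambda a: a["outcome"])
print(ends_choice["name"])   # shortcut

# Process-based reward: the "means" are now the quantity being
# maximized, so they function as ends from the optimizer's view.
means_choice = greedy_policy(actions, lambda a: a["process_quality"])
print(means_choice["name"])  # deliberate
```

The point of the sketch is not that process-based reward is wrong, only that whatever quantity the reward function scores is, by definition, the "end" the optimizer pursues.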