I like this one. I think it does a lot to capture both the concept and the problem.
The concept is that we expect AI systems to be convergently goal-directed.
The problem is that people in AI research are often uncertain about goal-directedness and its emergence in advanced AI systems. (My attempt to paraphrase the problem of the post in terms of goal-directedness, at least.)
Nothing comes to mind as a single term, in particular because I usually think of ‘thinking’, ‘predicting’, and ‘planning’ separately.
If you’re okay with multiple terms, ‘thinking, predicting, and planning’.
Aside: now’s a great time to potentially rewrite the LW tag header on consequentialism to match this meaning/framing. (Would probably help with aligning people on this site, at least). https://www.lesswrong.com/tag/consequentialism
Saying this again separately, if you taboo ‘consequentialism’ and take these as the definitions for a concept:
I think this is what “the majority of alignment researchers who probably are less on-the-ball” are in fact thinking about quite often.
We just don’t call it ‘consequentialism’.
Does it have a name, or is it just a vaguely amorphous concept blob?
Goal-directed?