I like this one. I think it does a lot to capture both the concept and the problem.
The concept is that we expect AI systems to be convergently goal-directed.
The problem is that people in AI research are often uncertain about goal-directedness and its emergence in advanced AI systems. (My attempt to paraphrase the problem of the post in terms of goal-directedness, at least.)
Nothing comes to mind as a single term, in particular because I usually think of ‘thinking’, ‘predicting’, and ‘planning’ separately.
If you’re okay with multiple terms, ‘thinking, predicting, and planning’.
Aside: now’s a great time to potentially rewrite the LW tag header on consequentialism to match this meaning/framing. (Would probably help with aligning people on this site, at least). https://www.lesswrong.com/tag/consequentialism
Saying this again separately, if you taboo ‘consequentialism’ and take these as the definitions for a concept:
I think this is what “the majority of alignment researchers who probably are less on-the-ball” are in fact thinking about quite often.
We just don’t call it ‘consequentialism’.
Does it have a name, or is it just a vaguely amorphous concept blob?
Goal-directed?